OwenZhu's Blog

Tensorflow Serving Source Code Walkthrough

2019/05/25

Model Servers Main

main.cc

flag_list: the parsed TensorFlow Model Server options, which include:

  • port: Port to listen on for the gRPC API.
  • grpc_socket_path: If non-empty, listen to a UNIX socket for the gRPC API at the given path.
  • rest_api_port: Port to listen on for the HTTP/REST API.
  • rest_api_num_threads: Number of threads for HTTP/REST API processing.
  • rest_api_timeout_in_ms: Timeout for HTTP/REST API calls.
  • enable_batching: Enable request batching.
  • batching_parameters_file: If non-empty, read an ASCII BatchingParameters protobuf from the supplied file name and use the contained values instead of the defaults.
  • model_config_file: If non-empty, read an ASCII ModelServerConfig protobuf from the supplied file name and serve the models in that file. This config file can be used to specify multiple models to serve and other advanced parameters, including a non-default version policy. (If used, --model_name and --model_base_path are ignored.)
  • model_name: Name of the model (ignored if the --model_config_file flag is set).
  • model_base_path: Path to the exported model (ignored if the --model_config_file flag is set; otherwise required).
  • max_num_load_retries: Maximum number of times loading a model is retried after the first failure, before giving up.
  • load_retry_interval_micros: The interval, in microseconds, between each servable load retry.
  • file_system_poll_wait_seconds: Interval in seconds between each poll of the filesystem for new model versions.
  • flush_filesystem_caches: If true (the default), filesystem caches are flushed after the initial load of all servables and after each subsequent individual servable reload. This reduces the memory consumption of the model server, at the potential cost of cache misses if model files are accessed after servables are loaded.
  • tensorflow_session_parallelism: Number of threads to use for running a TensorFlow session. Auto-configured by default.
  • tensorflow_intra_op_parallelism: Number of threads used to parallelize the execution of an individual op. Auto-configured by default.
  • tensorflow_inter_op_parallelism: Controls the number of operators that can be executed simultaneously. Auto-configured by default.
  • ssl_config_file: If non-empty, read an ASCII SSLConfig protobuf from the supplied file name and set up a secure gRPC channel.
  • platform_config_file: If non-empty, read an ASCII PlatformConfigMap protobuf from the supplied file name, and use that platform config instead of the TensorFlow platform.
  • per_process_gpu_memory_fraction: Fraction of the GPU memory space that each process occupies; the value is between 0.0 and 1.0, with 0.0 as the default. If 1.0, the server allocates all of the GPU memory when it starts; if 0.0, TensorFlow automatically selects a value.
  • saved_model_tags: Comma-separated set of tags corresponding to the MetaGraphDef to load from the SavedModel.
  • grpc_channel_arguments: A comma-separated list of arguments to be passed to the gRPC server.
  • enable_model_warmup: Enables model warmup, which triggers lazy initializations (such as TF optimizations) at load time, to reduce first-request latency.
  • version: Display version.
  • monitoring_config_file: If non-empty, read an ASCII MonitoringConfig protobuf from the supplied file name.
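To see how these flags fit together in practice, here is a sketch of a typical server invocation. The binary name matches the standard release, but the model name and all paths are placeholders, not values from main.cc:

```shell
# Sketch: launch the model server with a single model plus batching.
# "my_model" and the /models, /config paths are hypothetical examples.
tensorflow_model_server \
  --port=8500 \
  --rest_api_port=8501 \
  --model_name=my_model \
  --model_base_path=/models/my_model \
  --enable_batching=true \
  --batching_parameters_file=/config/batching.conf
```

If --model_config_file were passed instead, the --model_name and --model_base_path flags above would be ignored, as noted in the flag list.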
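For --model_config_file, the expected input is an ASCII (text-format) ModelServerConfig protobuf. A minimal sketch serving two models might look like this (model names and base paths are made-up examples):

```protobuf
# Hypothetical model_config_file: ASCII ModelServerConfig serving two models.
model_config_list {
  config {
    name: "model_a"                # served model name used by clients
    base_path: "/models/model_a"   # directory containing versioned SavedModels
    model_platform: "tensorflow"
  }
  config {
    name: "model_b"
    base_path: "/models/model_b"
    model_platform: "tensorflow"
  }
}
```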
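Similarly, --batching_parameters_file takes an ASCII BatchingParameters protobuf. A rough sketch is below; the field names follow the BatchingParameters message, but the specific values are arbitrary examples, not recommended defaults:

```protobuf
# Hypothetical batching_parameters_file: ASCII BatchingParameters.
max_batch_size { value: 128 }          # largest batch the server will form
batch_timeout_micros { value: 1000 }   # how long to wait to fill a batch
max_enqueued_batches { value: 100 }    # queue depth before rejecting requests
num_batch_threads { value: 4 }         # threads processing formed batches
```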