Creates a pipeline. For a batch pipeline, you can pass scheduler information. Data Pipelines uses the scheduler information to create an internal scheduler that runs jobs periodically. If the internal scheduler is not configured, you can use RunPipeline to run jobs.
Scopes
You will need authorization for the https://www.googleapis.com/auth/cloud-platform scope to make a valid call.
If unset, the scope for this method defaults to https://www.googleapis.com/auth/cloud-platform.
You can set the scope for this method like this: datapipelines1 --scope <scope> projects locations-pipelines-create ...
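For example, passing the default cloud-platform scope explicitly looks like this (the trailing arguments are the ones described in the sections below):
datapipelines1 --scope https://www.googleapis.com/auth/cloud-platform projects locations-pipelines-create ...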
Required Scalar Argument
- <parent> (string)
- Required. The location name. For example: projects/PROJECT_ID/locations/LOCATION_ID.
Required Request Value
The request value is a data-structure with various fields. Each field may be a simple scalar or another data-structure. In the latter case it is advised to set the field-cursor to the data-structure's field to specify values more concisely.
For example, a structure like this:
GoogleCloudDatapipelinesV1Pipeline:
  create-time: string
  display-name: string
  job-count: integer
  last-update-time: string
  name: string
  pipeline-sources: { string: string }
  schedule-info:
    next-job-time: string
    schedule: string
    time-zone: string
  scheduler-service-account-email: string
  state: string
  type: string
  workload:
    dataflow-flex-template-request:
      launch-parameter:
        container-spec-gcs-path: string
        environment:
          additional-experiments: [string]
          additional-user-labels: { string: string }
          enable-streaming-engine: boolean
          flexrs-goal: string
          ip-configuration: string
          kms-key-name: string
          machine-type: string
          max-workers: integer
          network: string
          num-workers: integer
          service-account-email: string
          subnetwork: string
          temp-location: string
          worker-region: string
          worker-zone: string
          zone: string
        job-name: string
        launch-options: { string: string }
        parameters: { string: string }
        transform-name-mappings: { string: string }
        update: boolean
      location: string
      project-id: string
      validate-only: boolean
    dataflow-launch-template-request:
      gcs-path: string
      launch-parameters:
        environment:
          additional-experiments: [string]
          additional-user-labels: { string: string }
          bypass-temp-dir-validation: boolean
          enable-streaming-engine: boolean
          ip-configuration: string
          kms-key-name: string
          machine-type: string
          max-workers: integer
          network: string
          num-workers: integer
          service-account-email: string
          subnetwork: string
          temp-location: string
          worker-region: string
          worker-zone: string
          zone: string
        job-name: string
        parameters: { string: string }
        transform-name-mapping: { string: string }
        update: boolean
      location: string
      project-id: string
      validate-only: boolean
can be set completely with the following arguments, which are assumed to be executed in the given order. Note how the cursor position is adjusted to the respective structures, allowing simple field names to be used most of the time. A complete example invocation assembled from these arguments follows the list.
-r . create-time=et
- Output only. Immutable. The timestamp when the pipeline was initially created. Set by the Data Pipelines service.
display-name=magna
- Required. The display name of the pipeline. It can contain only letters ([A-Za-z]), numbers ([0-9]), hyphens (-), and underscores (_).
job-count=90
- Output only. Number of jobs.
last-update-time=ipsum
- Output only. Immutable. The timestamp when the pipeline was last modified. Set by the Data Pipelines service.
name=voluptua.
- The pipeline name. For example: projects/PROJECT_ID/locations/LOCATION_ID/pipelines/PIPELINE_ID. * PROJECT_ID can contain letters ([A-Za-z]), numbers ([0-9]), hyphens (-), colons (:), and periods (.). For more information, see Identifying projects. * LOCATION_ID is the canonical ID for the pipeline's location. The list of available locations can be obtained by calling google.cloud.location.Locations.ListLocations. Note that the Data Pipelines service is not available in all regions. It depends on Cloud Scheduler, an App Engine application, so it's only available in App Engine regions. * PIPELINE_ID is the ID of the pipeline. Must be unique for the selected project and location.
pipeline-sources=key=at
- Immutable. The sources of the pipeline (for example, Dataplex). The keys and values are set by the corresponding sources during pipeline creation.
- the value will be associated with the given key
schedule-info next-job-time=sanctus
- Output only. When the next Scheduler job is going to run.
schedule=sed
- Unix-cron format of the schedule. This information is retrieved from the linked Cloud Scheduler.
time-zone=amet.
- Timezone ID. This matches the timezone IDs used by the Cloud Scheduler API. If empty, UTC time is assumed.
.. scheduler-service-account-email=takimata
- Optional. A service account email to be used with the Cloud Scheduler job. If not specified, the default Compute Engine service account will be used.
state=amet.
- Required. The state of the pipeline. When the pipeline is created, the state is set to 'PIPELINE_STATE_ACTIVE' by default. State changes can be requested by setting the state to stopping, paused, or resuming. State cannot be changed through UpdatePipeline requests.
type=duo
- Required. The type of the pipeline. This field affects the scheduling of the pipeline and the type of metrics to show for the pipeline.
workload.dataflow-flex-template-request.launch-parameter container-spec-gcs-path=ipsum
- Cloud Storage path to a file with a JSON-serialized ContainerSpec as content.
environment additional-experiments=gubergren
- Additional experiment flags for the job.
- Each invocation of this argument appends the given value to the array.
additional-user-labels=key=lorem
- Additional user labels to be specified for the job. Keys and values must follow the restrictions specified in the labeling restrictions. An object containing a list of key/value pairs. Example: { "name": "wrench", "mass": "1kg", "count": "3" }.
- the value will be associated with the given key
enable-streaming-engine=false
- Whether to enable Streaming Engine for the job.
flexrs-goal=dolor
- Set FlexRS goal for the job. https://cloud.google.com/dataflow/docs/guides/flexrs
ip-configuration=ea
- Configuration for VM IPs.
kms-key-name=ipsum
- Name for the Cloud KMS key for the job. Key format is: projects//locations//keyRings//cryptoKeys/
machine-type=invidunt
- The machine type to use for the job. Defaults to the value from the template if not specified.
max-workers=54
- The maximum number of Compute Engine instances to be made available to your pipeline during execution, from 1 to 1000.
network=duo
- Network to which VMs will be assigned. If empty or unspecified, the service will use the network "default".
num-workers=51
- The initial number of Compute Engine instances for the job.
service-account-email=sed
- The email address of the service account to run the job as.
subnetwork=ut
- Subnetwork to which VMs will be assigned, if desired. You can specify a subnetwork using either a complete URL or an abbreviated path. Expected to be of the form "https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNETWORK" or "regions/REGION/subnetworks/SUBNETWORK". If the subnetwork is located in a Shared VPC network, you must use the complete URL.
temp-location=gubergren
- The Cloud Storage path to use for temporary files. Must be a valid Cloud Storage URL, beginning with gs://.
worker-region=rebum.
- The Compute Engine region (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1". Mutually exclusive with worker_zone. If neither worker_region nor worker_zone is specified, defaults to the control plane region.
worker-zone=est
- The Compute Engine zone (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1-a". Mutually exclusive with worker_region. If neither worker_region nor worker_zone is specified, a zone in the control plane region is chosen based on available capacity. If both worker_zone and zone are set, worker_zone takes precedence.
zone=ipsum
- The Compute Engine availability zone for launching worker instances to run your pipeline. In the future, worker_zone will take precedence.
.. job-name=ipsum
- Required. The job name to use for the created job. For an update job request, the job name should be the same as the existing running job.
launch-options=key=est
- Launch options for this Flex Template job. This is a common set of options across languages and templates. This should not be used to pass job parameters.
- the value will be associated with the given key
parameters=key=gubergren
- The parameters for the Flex Template. Example: {"num_workers":"5"}
- the value will be associated with the given key
transform-name-mappings=key=ea
- Use this to pass transform name mappings for streaming update jobs. Example: {"oldTransformName":"newTransformName",...}
- the value will be associated with the given key
update=false
- Set this to true if you are sending a request to update a running streaming job. When set, the job name should be the same as the running job.
.. location=lorem
- Required. The regional endpoint (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) to which to direct the request. For example, us-central1, us-west1.
project-id=eos
- Required. The ID of the Cloud Platform project that the job belongs to.
validate-only=false
- If true, the request is validated but not actually executed. Defaults to false.
..dataflow-launch-template-request gcs-path=sed
- A Cloud Storage path to the template from which to create the job. Must be a valid Cloud Storage URL, beginning with 'gs://'.
launch-parameters.environment additional-experiments=duo
- Additional experiment flags for the job.
- Each invocation of this argument appends the given value to the array.
additional-user-labels=key=sed
- Additional user labels to be specified for the job. Keys and values should follow the restrictions specified in the labeling restrictions page. An object containing a list of key/value pairs. Example: { "name": "wrench", "mass": "1kg", "count": "3" }.
- the value will be associated with the given key
bypass-temp-dir-validation=true
- Whether to bypass the safety checks for the job's temporary directory. Use with caution.
enable-streaming-engine=true
- Whether to enable Streaming Engine for the job.
ip-configuration=et
- Configuration for VM IPs.
kms-key-name=et
- Name for the Cloud KMS key for the job. The key format is: projects//locations//keyRings//cryptoKeys/
machine-type=vero
- The machine type to use for the job. Defaults to the value from the template if not specified.
max-workers=70
- The maximum number of Compute Engine instances to be made available to your pipeline during execution, from 1 to 1000.
network=sed
- Network to which VMs will be assigned. If empty or unspecified, the service will use the network "default".
num-workers=81
- The initial number of Compute Engine instances for the job.
service-account-email=dolore
- The email address of the service account to run the job as.
subnetwork=et
- Subnetwork to which VMs will be assigned, if desired. You can specify a subnetwork using either a complete URL or an abbreviated path. Expected to be of the form "https://www.googleapis.com/compute/v1/projects/HOST_PROJECT_ID/regions/REGION/subnetworks/SUBNETWORK" or "regions/REGION/subnetworks/SUBNETWORK". If the subnetwork is located in a Shared VPC network, you must use the complete URL.
temp-location=voluptua.
- The Cloud Storage path to use for temporary files. Must be a valid Cloud Storage URL, beginning with gs://.
worker-region=amet.
- The Compute Engine region (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1". Mutually exclusive with worker_zone. If neither worker_region nor worker_zone is specified, defaults to the control plane's region.
worker-zone=consetetur
- The Compute Engine zone (https://cloud.google.com/compute/docs/regions-zones/regions-zones) in which worker processing should occur, e.g. "us-west1-a". Mutually exclusive with worker_region. If neither worker_region nor worker_zone is specified, a zone in the control plane's region is chosen based on available capacity. If both worker_zone and zone are set, worker_zone takes precedence.
zone=diam
- The Compute Engine availability zone for launching worker instances to run your pipeline. In the future, worker_zone will take precedence.
.. job-name=dolor
- Required. The job name to use for the created job.
parameters=key=et
- The runtime parameters to pass to the job.
- the value will be associated with the given key
transform-name-mapping=key=et
- Map of transform name prefixes of the job to be replaced to the corresponding name prefixes of the new job. Only applicable when updating a pipeline.
- the value will be associated with the given key
update=false
- If set, replace the existing pipeline with the name specified by jobName with this pipeline, preserving state.
.. location=stet
- The [regional endpoint] (https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) to which to direct the request.
project-id=dolor
- Required. The ID of the Cloud Platform project that the job belongs to.
validate-only=false
- If true, the request is validated but not actually executed. Defaults to false.
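Putting the walkthrough together, a minimal end-to-end invocation might look like the sketch below. All values (project, location, display name, schedule, bucket path) are placeholders, and the enum spellings for type and state are assumptions to be checked against the API reference; the field and cursor syntax follows the pattern shown above, with shell quoting added around the cron expression.
datapipelines1 projects locations-pipelines-create projects/my-project/locations/us-central1 \
  -r . display-name=my-nightly-pipeline \
       type=PIPELINE_TYPE_BATCH \
       state=PIPELINE_STATE_ACTIVE \
       schedule-info schedule='0 2 * * *' time-zone=America/New_York \
       .. workload.dataflow-flex-template-request project-id=my-project location=us-central1 \
       launch-parameter job-name=my-nightly-job container-spec-gcs-path=gs://my-bucket/templates/spec.json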
About Cursors
The cursor position is key to comfortably setting complex nested structures. The following rules apply:
- The cursor position is always set relative to the current one, unless the field name starts with the . character. Fields can be nested, as in -r f.s.o
- The cursor position is set relative to the top-level structure if it starts with a . , e.g. -r .s.s
- You can also set nested fields without setting the cursor explicitly. For example, to set a value relative to the current cursor position, you would specify -r struct.sub_struct=bar
- You can move the cursor one level up by using .. and each additional . moves it up one additional level, e.g. .... would go three levels up.
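As an illustration of these rules (all values are placeholders), the following fragment positions the cursor deep inside the flex-template workload, descends into its environment, and then climbs back up with .. to set sibling fields:
-r workload.dataflow-flex-template-request.launch-parameter \
     container-spec-gcs-path=gs://my-bucket/templates/spec.json \
     environment max-workers=10 machine-type=n1-standard-2 \
     .. job-name=my-job \
     .. location=us-central1 project-id=my-project
The first .. returns the cursor from environment to launch-parameter, where job-name lives; the second returns it to dataflow-flex-template-request, where location and project-id are set.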
Optional Output Flags
The method's return value is a JSON-encoded structure, which will be written to standard output by default.
- -o out
- out specifies the destination to which to write the server's result. It will be a JSON-encoded structure. The destination may be - to indicate standard output, or a filepath that is to contain the received bytes. If unset, it defaults to standard output.
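For example, to write the created pipeline resource to a file instead of standard output (the file name is a placeholder and the remaining request arguments are elided for brevity):
datapipelines1 projects locations-pipelines-create projects/my-project/locations/us-central1 -r . display-name=my-pipeline ... -o pipeline.json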
Optional General Properties
The following properties can configure any call, and are not specific to this method.
- -p $-xgafv=string
- V1 error format.
- -p access-token=string
- OAuth access token.
- -p alt=string
- Data format for response.
- -p callback=string
- JSONP
- -p fields=string
- Selector specifying which fields to include in a partial response.
- -p key=string
- API key. Your API key identifies your project and provides you with API access, quota, and reports. Required unless you provide an OAuth 2.0 token.
- -p oauth-token=string
- OAuth 2.0 token for the current user.
- -p pretty-print=boolean
- Returns response with indentations and line breaks.
- -p quota-user=string
- Available to use for quota purposes for server-side applications. Can be any arbitrary string assigned to a user, but should not exceed 40 characters.
- -p upload-type=string
- Legacy upload protocol for media (e.g. "media", "multipart").
- -p upload-protocol=string
- Upload protocol for media (e.g. "raw", "multipart").
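These properties are passed with repeated -p flags alongside the method arguments. As a sketch (placeholder values; the fields selector and the elided request arguments are purely illustrative of where the flags go), a call could request a partial, pretty-printed response like this:
datapipelines1 projects locations-pipelines-create projects/my-project/locations/us-central1 \
  -r . display-name=my-pipeline ... \
  -p fields=name,state -p pretty-print=true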