Creates a DataScan resource.

Scopes

You will need authorization for the https://www.googleapis.com/auth/cloud-platform scope to make a valid call.

If unset, the scope for this method defaults to https://www.googleapis.com/auth/cloud-platform. You can set the scope for this method like this: dataplex1 --scope <scope> projects locations-data-scans-create ...

Required Scalar Argument

  • <parent> (string)
    • Required. The resource name of the parent location: projects/{project}/locations/{location_id} where project refers to a project_id or project_number and location_id refers to a GCP region.

Required Request Value

The request value is a data-structure with various fields. Each field may be a simple scalar or another data-structure. In the latter case it is advised to set the field-cursor to the data-structure's field to specify values more concisely.

For example, a structure like this:

GoogleCloudDataplexV1DataScan:
  create-time: string
  data:
    entity: string
    resource: string
  data-profile-result:
    post-scan-actions-result:
      bigquery-export-result:
        message: string
        state: string
    row-count: int64
    scanned-data:
      incremental-field:
        end: string
        field: string
        start: string
  data-profile-spec:
    exclude-fields:
      field-names: [string]
    include-fields:
      field-names: [string]
    post-scan-actions:
      bigquery-export:
        results-table: string
    row-filter: string
    sampling-percent: number
  data-quality-result:
    passed: boolean
    post-scan-actions-result:
      bigquery-export-result:
        message: string
        state: string
    row-count: int64
    scanned-data:
      incremental-field:
        end: string
        field: string
        start: string
    score: number
  data-quality-spec:
    post-scan-actions:
      bigquery-export:
        results-table: string
    row-filter: string
    sampling-percent: number
  description: string
  display-name: string
  execution-spec:
    field: string
    trigger:
      schedule:
        cron: string
  execution-status:
    latest-job-end-time: string
    latest-job-start-time: string
  labels: { string: string }
  name: string
  state: string
  type: string
  uid: string
  update-time: string

can be set completely with the following arguments, which are assumed to be executed in the given order. Note how the cursor position is adjusted to the respective structures, allowing simple field names to be used most of the time. A complete invocation assembled from these arguments is sketched after the list.

  • -r . create-time=est
    • Output only. The time when the scan was created.
  • data entity=ipsum
    • Immutable. The Dataplex entity that represents the data source (e.g. BigQuery table) for DataScan, of the form: projects/{project_number}/locations/{location_id}/lakes/{lake_id}/zones/{zone_id}/entities/{entity_id}.
  • resource=ipsum

    • Immutable. The service-qualified full resource name of the cloud resource for a DataScan job to scan against. The field could be: BigQuery table of type "TABLE" for DataProfileScan/DataQualityScan. Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID
  • ..data-profile-result.post-scan-actions-result.bigquery-export-result message=est

    • Output only. Additional information about the BigQuery exporting.
  • state=gubergren

    • Output only. Execution state for the BigQuery exporting.
  • ... row-count=-17

    • The count of rows scanned.
  • scanned-data.incremental-field end=dolor
    • Value that marks the end of the range.
  • field=lorem
    • The field that contains values which monotonically increase over time (e.g. a timestamp column).
  • start=eos

    • Value that marks the start of the range.
  • ....data-profile-spec.exclude-fields field-names=labore

    • Optional. Expected input is a list of fully qualified names of fields as in the schema. Only top-level field names for nested fields are supported. For instance, if 'x' is of nested field type, listing 'x' is supported but 'x.y.z' is not supported. Here 'y' and 'y.z' are nested fields of 'x'.
    • Each invocation of this argument appends the given value to the array.
  • ..include-fields field-names=sed

    • Optional. Expected input is a list of fully qualified names of fields as in the schema. Only top-level field names for nested fields are supported. For instance, if 'x' is of nested field type, listing 'x' is supported but 'x.y.z' is not supported. Here 'y' and 'y.z' are nested fields of 'x'.
    • Each invocation of this argument appends the given value to the array.
  • ..post-scan-actions.bigquery-export results-table=duo

    • Optional. The BigQuery table to export DataProfileScan results to. Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID
  • ... row-filter=sed

    • Optional. A filter applied to all rows in a single DataScan job. The filter needs to be a valid SQL expression for a WHERE clause in BigQuery standard SQL syntax. Example: col1 >= 0 AND col2 < 10
  • sampling-percent=0.7957202632117738

    • Optional. The percentage of the records to be selected from the dataset for DataScan. Value can range between 0.0 and 100.0 with up to 3 significant decimal digits. Sampling is not applied if sampling_percent is not specified, 0 or 100.
  • ..data-quality-result passed=true

    • Overall data quality result -- true if all rules passed.
  • post-scan-actions-result.bigquery-export-result message=stet
    • Output only. Additional information about the BigQuery exporting.
  • state=kasd

    • Output only. Execution state for the BigQuery exporting.
  • ... row-count=-24

    • The count of rows processed.
  • scanned-data.incremental-field end=sed
    • Value that marks the end of the range.
  • field=et
    • The field that contains values which monotonically increase over time (e.g. a timestamp column).
  • start=et

    • Value that marks the start of the range.
  • ... score=0.8039909988714865

    • Output only. The overall data quality score. The score ranges between 0 and 100 (up to two decimal points).
  • ..data-quality-spec.post-scan-actions.bigquery-export results-table=erat

    • Optional. The BigQuery table to export DataQualityScan results to. Format: //bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID
  • ... row-filter=sed

    • Optional. A filter applied to all rows in a single DataScan job. The filter needs to be a valid SQL expression for a WHERE clause in BigQuery standard SQL syntax. Example: col1 >= 0 AND col2 < 10
  • sampling-percent=0.6383502522516505

    • Optional. The percentage of the records to be selected from the dataset for DataScan. Value can range between 0.0 and 100.0 with up to 3 significant decimal digits. Sampling is not applied if sampling_percent is not specified, 0 or 100.
  • .. description=et

    • Optional. Description of the scan. Must be between 1-1024 characters.
  • display-name=voluptua.
    • Optional. User friendly display name. Must be between 1-256 characters.
  • execution-spec field=amet.
    • Immutable. The unnested field (of type Date or Timestamp) that contains values which monotonically increase over time. If not specified, a data scan will run for all data in the table.
  • trigger.schedule cron=consetetur

    • Required. Cron (https://en.wikipedia.org/wiki/Cron) schedule for running scans periodically. To explicitly set a timezone in the cron tab, apply a prefix in the cron tab: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} may only be a valid string from the IANA time zone database (wikipedia (https://en.wikipedia.org/wiki/List_of_tz_database_time_zones#List)). For example, CRON_TZ=America/New_York 1 * * * *, or TZ=America/New_York 1 * * * *. This field is required for Schedule scans.
  • ....execution-status latest-job-end-time=diam

    • The time when the latest DataScanJob ended.
  • latest-job-start-time=dolor

    • The time when the latest DataScanJob started.
  • .. labels=key=et

    • Optional. User-defined labels for the scan.
    • the value will be associated with the given key
  • name=et
    • Output only. The relative resource name of the scan, of the form: projects/{project}/locations/{location_id}/dataScans/{datascan_id}, where project refers to a project_id or project_number and location_id refers to a GCP region.
  • state=sadipscing
    • Output only. Current state of the DataScan.
  • type=stet
    • Output only. The type of DataScan.
  • uid=dolor
    • Output only. System generated globally unique ID for the scan. This ID will be different if the scan is deleted and re-created with the same name.
  • update-time=duo
    • Output only. The time when the scan was last updated.
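
Putting these pieces together, a full invocation might look like the sketch below. The flags, cursor movements, and field names are taken from this page; the project, location, table, and DataScan ID values are placeholders, and the exact argument placement and quoting may vary with your shell.

  # placeholder project, dataset, table, and scan id
  dataplex1 projects locations-data-scans-create projects/my-project/locations/us-central1 \
    -r . display-name='Orders profile scan' \
         description='Profiles the orders table nightly' \
         data resource=//bigquery.googleapis.com/projects/my-project/datasets/sales/tables/orders \
         ..data-profile-spec sampling-percent=10.0 \
         ..execution-spec.trigger.schedule cron='CRON_TZ=America/New_York 0 2 * * *' \
    -p data-scan-id=orders-profile-scan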

About Cursors

The cursor position is key to comfortably set complex nested structures. The following rules apply:

  • The cursor position is always set relative to the current one, unless the field name starts with the . character. Fields can be nested such as in -r f.s.o .
  • The cursor position is set relative to the top-level structure if it starts with ., e.g. -r .s.s
  • You can also set nested fields without setting the cursor explicitly. For example, to set a value relative to the current cursor position, you would specify -r struct.sub_struct=bar.
  • You can move the cursor one level up with .. (two dots). Each additional . moves it up one more level, e.g. .... moves it up three levels (see the sketch after this list).
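
For instance, the following fragment is a hedged sketch of cursor movement using fields from the structure above (all values are placeholders): the cursor descends three levels into data-profile-spec.post-scan-actions.bigquery-export, then ... moves it up two levels to data-profile-spec, and .. moves it up one more level back to the top.

  -r . display-name='Orders profile scan' \
       data-profile-spec.post-scan-actions.bigquery-export results-table=//bigquery.googleapis.com/projects/p/datasets/d/tables/t \
       ... row-filter='col1 >= 0' \
       .. description='Profiles the orders table'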

Optional Output Flags

The method's return value is a JSON-encoded structure, which will be written to standard output by default.

  • -o out
    • out specifies the destination to which the server's result will be written. It will be a JSON-encoded structure. The destination may be - to indicate standard output, or a filepath that is to contain the received bytes. If unset, it defaults to standard output.
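
For example, to write the created DataScan to a file instead of standard output (the filename is a placeholder, and flag placement may vary with your shell):

  dataplex1 projects locations-data-scans-create ... -o created-scan.json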

Optional Method Properties

You may set the following properties to further configure the call. Please note that -p is followed by one or more key-value pairs, and is called like this: -p k1=v1 k2=v2, even though the listing below repeats the -p for completeness. An example appears after the listing.

  • -p data-scan-id=string

    • Required. DataScan identifier. Must contain only lowercase letters, numbers and hyphens. Must start with a letter. Must end with a number or a letter. Must be between 1-63 characters. Must be unique within the customer project / location.
  • -p validate-only=boolean

    • Optional. Only validate the request, but do not perform mutations. The default is false.
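
For example, to supply the required DataScan identifier and validate the request without performing any mutation (the identifier value is a placeholder):

  -p data-scan-id=orders-profile-scan validate-only=true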

Optional General Properties

The following properties can configure any call, and are not specific to this method.

  • -p $-xgafv=string

    • V1 error format.
  • -p access-token=string

    • OAuth access token.
  • -p alt=string

    • Data format for response.
  • -p callback=string

    • JSONP
  • -p fields=string

    • Selector specifying which fields to include in a partial response.
  • -p key=string

    • API key. Your API key identifies your project and provides you with API access, quota, and reports. Required unless you provide an OAuth 2.0 token.
  • -p oauth-token=string

    • OAuth 2.0 token for the current user.
  • -p pretty-print=boolean

    • Returns response with indentations and line breaks.
  • -p quota-user=string

    • Available to use for quota purposes for server-side applications. Can be any arbitrary string assigned to a user, but should not exceed 40 characters.
  • -p upload-type=string

    • Legacy upload protocol for media (e.g. "media", "multipart").
  • -p upload-protocol=string

    • Upload protocol for media (e.g. "raw", "multipart").
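
For example, to ask for an indented response limited to a few fields (the field selector shown is an assumption based on the API's standard partial-response syntax):

  -p pretty-print=true fields=name,uid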