Crawling

Crawled information

= Remarks =

The terms used in the crawled information have the following meanings:

Script comments: Comments set on a DataSpider script.

Comments for tables: Comments set on a database table.

Reference source information of a view: The SQL statement that defines a database view.

Column information of constraints: The constraints applied to a database column. You can check the following constraints:

  • NOT NULL constraint

  • UNIQUE constraint

  • Primary Key

  • Foreign Key

With JDBC, some constraint information cannot be obtained, depending on the resource.
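To make "Reference source information of a view" and "Column information of constraints" more concrete, the following is a minimal sketch of reading both kinds of metadata from a database catalog. It uses an in-memory SQLite database as a stand-in only because it needs no server; SQLite is not one of the crawl targets, and this is not how the product itself collects the data. The table names and the functions `view_definition` and `column_constraints` are hypothetical.

```python
import sqlite3

# SQLite is only a stand-in here; it is not one of the crawl targets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    CREATE VIEW order_emails AS
        SELECT o.id, c.email
        FROM orders AS o JOIN customers AS c ON c.id = o.customer_id;
""")


def view_definition(conn, view_name):
    """Return the SQL statement that defines the view, or None if it does not exist."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = ?",
        (view_name,),
    ).fetchone()
    return row[0] if row else None


def column_constraints(conn, table_name):
    """Return {column name: set of constraint labels} for one table.

    PRAGMA arguments cannot be bound as parameters, so table_name must be trusted.
    """
    result = {}
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    for row in conn.execute(f"PRAGMA table_info({table_name})"):
        labels = set()
        if row[3]:
            labels.add("NOT NULL")
        if row[5]:
            labels.add("PRIMARY KEY")
        result[row[1]] = labels
    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    for row in conn.execute(f"PRAGMA foreign_key_list({table_name})"):
        result[row[3]].add("FOREIGN KEY")
    # index_list rows: (seq, name, unique, origin, partial); origin 'u' marks a UNIQUE constraint
    for row in conn.execute(f"PRAGMA index_list({table_name})").fetchall():
        if row[2] and row[3] == "u":
            # index_info rows: (seqno, cid, name)
            for info in conn.execute(f"PRAGMA index_info({row[1]})"):
                result[info[2]].add("UNIQUE")
    return result


print(view_definition(conn, "order_emails"))
print(column_constraints(conn, "customers"))
```

Other database products expose the same kind of information through their own catalog views, and the details differ per product, which is consistent with the note that some constraint information may be unavailable over JDBC.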

The crawled information is as follows:

 

For DataSpider

  • Resources list (global resources)

  • Project information

    • Project names

  • Script information

    • Script names

    • Script comments

= Remarks =

Under "Resources list (global resources)", only the following types of global resources are crawled:

  • Types of global resources: Database

    • Db2

    • MySQL

    • Oracle

    • PostgreSQL

    • SQL Server

    • JDBC

  • Types of global resources: Cloud

    • Amazon RDS for MySQL

    • Amazon RDS for Oracle

    • Amazon RDS for PostgreSQL

    • Amazon RDS for SQL Server

 

For PostgreSQL, Oracle, SQL Server, MySQL, Db2, and JDBC

  • Schema information (other than MySQL)

    • Schema names

  • Table and view information

    • Table names and view names

    • Comments for tables and views (other than SQL Server)

    • Updated date

    • Number of records

    • Reference source information of views (other than JDBC)

  • Column information

    • Column names

    • Data types

    • Column capacity

    • Comments for columns (other than SQL Server)

    • Column information of constraints
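The "other than ..." qualifiers in the list above can be summarized as a small lookup. The following is a minimal sketch that encodes only what the list states; the item identifiers and the `crawled_items` function are hypothetical names chosen for illustration and are not part of the product.

```python
# Resource types covered by the list above.
DATABASE_TYPES = ("PostgreSQL", "Oracle", "SQL Server", "MySQL", "Db2", "JDBC")

# Each crawled item (hypothetical identifier) mapped to the resource types
# for which the list says it is NOT collected.
EXCLUDED_FOR = {
    "schema_names": {"MySQL"},
    "table_and_view_names": set(),
    "table_and_view_comments": {"SQL Server"},
    "updated_date": set(),
    "number_of_records": set(),
    "view_reference_source": {"JDBC"},
    "column_names": set(),
    "data_types": set(),
    "column_capacity": set(),
    "column_comments": {"SQL Server"},
    "column_constraints": set(),
}


def crawled_items(resource_type):
    """Return the items the list above says are crawled for a resource type."""
    if resource_type not in DATABASE_TYPES:
        raise ValueError(f"not a database resource type in this list: {resource_type}")
    return [
        item
        for item, excluded in EXCLUDED_FOR.items()
        if resource_type not in excluded
    ]


for resource_type in DATABASE_TYPES:
    print(resource_type, crawled_items(resource_type))
```

For example, `crawled_items("MySQL")` omits `schema_names`, and `crawled_items("SQL Server")` omits both comment items. Remember that with JDBC, constraint information is listed as crawled but may be only partly available.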

 

Amazon S3

  • Bucket information

    • Bucket names

  • Folder and file information

    • File names and folder names

    • Paths

    • Sizes

    • Updated date

 

Azure Blob Storage

  • Container information

    • Container names

  • Folder and file information

    • File names and folder names

    • URLs

    • Sizes

    • Updated date

 

Google Cloud Storage

  • Bucket information

    • Bucket names

  • Folder and file information

    • File names and folder names

    • Paths

    • Sizes

    • Updated date

 

Google BigQuery

  • Dataset information

    • Dataset names

    • Comments for datasets

  • Table and view information

    • Table names and view names

    • Comments for tables and views

    • Updated date

    • Number of records

    • Reference source information of a view

  • Field information

    • Field names

    • Data types

    • Field capacity

    • Comments for fields

    • Modes for fields

 

Crawling manually

To crawl at a time of your choosing, perform the following procedure:

1. Click [Resource] in the header menu.

2. Select the resources you want to crawl, and click [Crawling].

= Remarks =

Crawling may take some time, depending on the amount of data to be obtained.

3. Confirm that "Success" is displayed in the status column on the resources list screen.

= Remarks =

You can also check the crawling results in the log files. For details on log files, refer to the setup manual.