Crawl(Data Output/File)

Operation Name

Crawl(Data Output/File)

Function overview

Gets a Web page by sending the contents of a file as POST data.

Data model

The data model of this component is the XML type.

Properties

For information about using variables, refer to "variables".
Basic settings
Item name Required / Optional Use of Variables Description Remarks
Name Required Not available Enter the name to display on the script canvas.  
Required Settings
Item name Required / Optional Use of Variables Description Remarks
Destination Required Not available Select a global resource.
For how to set up a global resource, refer to "Global Resource Properties".
  • [Add]:
    Adds a new global resource.
  • [Edit list]:
    Edits global resource settings in "Edit Resource list".
 
Path Required Available Enter the path to connect to.
If a relative path (one that does not start with "/") is specified, "/" is automatically added to the beginning.
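
The path rule above can be sketched as follows. This is a minimal illustration in Python, not the adapter's actual implementation; the function name normalize_path is hypothetical.

```python
def normalize_path(path: str) -> str:
    """Return the path with a leading "/" added when it is missing.

    Hypothetical helper: it mirrors the documented rule that "/" is
    automatically added to the beginning of a relative path.
    """
    if path.startswith("/"):
        # Already starts with "/": leave it unchanged.
        return path
    return "/" + path
```

For example, normalize_path("search/index.html") returns "/search/index.html", while "/search/index.html" is returned as is.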
 
Method Required Not available Select the HTTP method.
  • [POST]: (default)
 
Content-Type Required Available Select or enter the Content-Type of the POST data.
  • [text/plain]: (default)
    plain text
  • [text/html]:
    HTML
  • [text/xml]:
    XML
  • [application/octet-stream]:
    Binary
 
POST file path Required Available Enter the path of the file to send as POST data.
Click the [Browse] button to open the file chooser and select the file.
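
Taken together, the required settings describe an HTTP POST whose body is the content of the file. The sketch below shows roughly what such a request looks like when built with Python's standard library. It is an illustration under assumptions, not the adapter's implementation: the function name build_post_request and the base_url parameter are hypothetical, and it assumes the global resource selected in [Destination] supplies the base URL.

```python
import urllib.request
from pathlib import Path


def build_post_request(base_url: str, path: str, file_path: str,
                       content_type: str = "text/plain") -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is the file content.

    base_url     -- assumed to come from the global resource in [Destination]
    path         -- value of [Path]; "/" is prepended when it is missing
    file_path    -- value of [POST file path]
    content_type -- value of [Content-Type]; text/plain is the documented default
    """
    if not path.startswith("/"):
        path = "/" + path
    body = Path(file_path).read_bytes()  # read as bytes so binary files also work
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        method="POST",
        headers={"Content-Type": content_type},
    )
```

Passing the returned object to urllib.request.urlopen would send the request. The response body corresponds to the Web page that this operation outputs.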
Authentication
Item name Required / Optional Use of Variables Description Remarks
Authentication mode Required Not available Select the authentication mode.
  • [None]: (default)
    Do not use authentication.
  • [Basic authentication]:
    Basic authentication is used.
 
User name Optional Available Enter the user name used for basic authentication.
Enabled when [Basic authentication] is selected in [Authentication mode].
 
Password Optional Available Enter the password used for basic authentication.
Enabled when [Basic authentication] is selected in [Authentication mode].
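
For reference, HTTP basic authentication sends the user name and password in an Authorization header, encoded with Base64. The following Python sketch shows how such a header is formed. The function name basic_auth_header is hypothetical, and UTF-8 encoding of the credentials is an assumption; the adapter's own encoding is not stated here.

```python
import base64


def basic_auth_header(user_name: str, password: str) -> dict:
    """Return the Authorization header for HTTP basic authentication.

    Hypothetical helper. The credentials are joined as "user:password",
    encoded as UTF-8 (an assumption), then Base64-encoded.
    """
    credentials = f"{user_name}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Because Base64 is an encoding and not encryption, credentials sent this way are only protected when the connection itself is encrypted.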
 
Cookie
Item name Required / Optional Use of Variables Description Remarks
Send cookie of subdomain to server Optional Not available Select whether to send saved subdomain cookies to the server.
  • [Checked]: (default)
    Sends saved subdomain cookies.
  • [Not Checked]:
    Does not send saved subdomain cookies.
  • Cookies are saved automatically after each Web adapter operation and are used by subsequent Web adapter operations.
  • When subsequent Web adapter operations use the same global resource, cookies of the same domain are sent automatically even if [Not Checked] is selected.
  • For details about [Checked], refer to "About subdomain of Cookie sending".
Comment
Item name Required / Optional Use of Variables Description Remarks
Comment Optional Not available You can write a short description of this adapter.
The description will be reflected in the specifications.
 

Schema

Input Schema

None.

Output Schema

Varies depending on the Web page being connected to.

Loading schema in Mapper

The schema must be loaded manually.
For details, refer to "Editing Schema".

Mass Data Processing

Mass data processing is not supported.

Transaction

Transaction is not supported.

PSP Usage

PSP is not supported.

Available component variables

Component Variable name Description Remarks
message_category Stores the category to which the message code of the error belongs, when an error occurs.
  • The value defaults to null.
message_code Stores the message code of the error that occurred.
  • The value defaults to null.
message_level Stores the severity of the message code of the error that occurred.
  • The value defaults to null.
error_type Stores the error type when an error occurs.
  • The value defaults to null.
  • The error type is represented in the following format.
    Example: java.io.FileNotFoundException
  • The message may vary depending on the DataSpider Servista version.
error_message Stores the error message when an error occurs.
  • The value defaults to null.
  • The message may vary depending on the DataSpider Servista version.
error_trace Stores the trace information when an error occurs.
  • The value defaults to null.
  • The message may vary depending on the DataSpider Servista version or the client application used.

About subdomain of Cookie sending

Specification Limits

Main exceptions

Exception Name | Causes | Solution
ResourceNotFoundException: Resource definition could not be found.Name: [] | [Destination] is not specified. | Specify [Destination].
ResourceNotFoundException: Resource definition could not be found.Name: [<name of Global Resources>] | The resource definition selected in [Destination] could not be found. | Check the global resource specified in [Destination].
java.net.UnknownHostException | The Web server could not be found. | Check the Web server configuration.
java.net.ConnectException | A connection to the Web server could not be established. | Check that the port number is correct and that the Web server is running.
org.apache.commons.httpclient.HttpConnection$ConnectionTimeoutException | The connection timed out while connecting to the Web server. | Check the state of the network, or check [Time out] of the global resource specified in [Destination].
org.apache.commons.httpclient.HttpRecoverableException: java.net.SocketTimeoutException: Read timed out | A timeout occurred after connecting to the Web server. | Check the state of the Web server, or check [Time out] of the global resource specified in [Destination].

Notes