Read CSV Data

Operation Name

Read CSV Data

Function Overview

Reads a CSV format file from HDFS.

Data Model

The data model of this component is the Table Model type.

Properties

For information about using variables, refer to "variables".
Basic settings
Item name Required/Optional Use of variables Description Remarks
Name Required Not available Enter the name displayed on the script canvas.  
Required settings
Item name Required/Optional Use of variables Description Remarks
Destination Required Not available Select a global resource.
  • [Add]:
    Adds a new global resource.
  • [Edit list]:
    Lets you edit the global resource settings on the "Edit resource list" screen.
 
HDFS file Path Required Available Enter the HDFS file path.
  • The following characters cannot be used.
    • space < > " ^ [ ] { } % | ` : ;
  • Multibyte characters cannot be used.
Column list Optional - Specify the columns.

Use the following buttons to manage the columns.
  • [Add]:
    Adds a column.
  • [Up]:
    Moves the selected column up by one position.
  • [Down]:
    Moves the selected column down by one position.
  • [Delete]:
    Deletes a column.
  • Data is read for all the columns set in [Column list].
  • The column names specified in [Column list] are displayed in the Mapper schema.
Column list/Column name Required Available Enter the name of the column.

You can use the [Update column list] property action to set the first row of the file specified in [HDFS file Path] as the column names.
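The character restrictions on [HDFS file Path] listed in the table above can be checked before a script runs. The following is a minimal sketch, not part of the product: the function name `find_path_problems` and the sample paths are hypothetical, and "multibyte" is approximated here as "any non-ASCII character", which is an assumption.

```python
from __future__ import annotations

# Characters that [HDFS file Path] does not accept, as listed in the table above.
FORBIDDEN_CHARS = set(' <>"^[]{}%|`:;')


def find_path_problems(path: str) -> list[str]:
    """Return the reasons a path would be rejected (empty list if none)."""
    problems = []
    bad = sorted({ch for ch in path if ch in FORBIDDEN_CHARS})
    if bad:
        problems.append("forbidden characters: " + " ".join(repr(ch) for ch in bad))
    # Assumption: "multibyte" is approximated as "any non-ASCII character".
    if not path.isascii():
        problems.append("multibyte (non-ASCII) characters are present")
    return problems


# Hypothetical sample paths, for illustration only.
print(find_path_problems("/user/etl/input/sales.csv"))
print(find_path_problems("/user/etl/my file.csv"))
```

The first call reports no problems. The second reports the space character, which appears in the list of characters that cannot be used.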
 
Property action
Item name Description Remarks
Update column list Sets the first line of the file specified in [HDFS file Path] as the column names.
  • Not available when the specified file does not exist or when variables are set in [HDFS file Path].
Get column name from the first row Select a file in the file selector to set the first line of that file as the column names.  
Get column count Select a file in the file selector to set the number of columns in [Column list] to the number of columns in that file.  
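To illustrate what the property actions above derive from a file, the following sketch extracts the column names and the column count from the first line of CSV text. It uses Python's standard `csv` module on an in-memory string rather than HDFS. The function name `first_row_info` and the sample data are hypothetical, and the product's own parsing rules may differ.

```python
from __future__ import annotations

import csv
import io


def first_row_info(csv_text: str) -> tuple[list[str], int]:
    """Return (values of the first row, number of columns in the first row)."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    first_row = next(reader, [])  # empty list when the text has no rows
    return first_row, len(first_row)


# Hypothetical sample data, for illustration only.
sample = "id,name,price\n1,apple,120\n2,pear,95\n"
names, count = first_row_info(sample)
print(names)  # ['id', 'name', 'price']
print(count)  # 3
```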
Read settings
Item name Required/Optional Use of variables Description Remarks
Encoding Required Available Select or enter the encoding of the file to read.

When entering the encoding directly in the field, use only a canonical name supported by Java SE Runtime Environment 8.
Refer to "Supported Encodings"(http://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html) for details.
  • Default value is "UTF-8".
Do not read the first line as a value Optional Not available Select whether to treat the first line of the specified file as data.
  • [Checked]:
    The first line is not treated as data.
  • [Not Checked]: (default)
    The first line is treated as data.
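The combined effect of [Encoding] and [Do not read the first line as a value] can be sketched as follows. This is an illustration only, not the adapter's implementation: the function `read_rows` and its parameters are hypothetical, the input is in-memory bytes rather than an HDFS file, and Python's codec names are not identical to the Java canonical names that the [Encoding] field expects.

```python
import csv
import io


def read_rows(raw: bytes, encoding: str = "UTF-8", skip_first_line: bool = False):
    """Decode CSV bytes and return the data rows as lists of strings."""
    # Python raises LookupError here if the encoding name is unknown.
    text = raw.decode(encoding)
    rows = list(csv.reader(io.StringIO(text, newline="")))
    # skip_first_line=True mirrors [Checked]: the first line is not data.
    return rows[1:] if skip_first_line else rows


# Hypothetical sample data, for illustration only.
raw = "id,name\n1,apple\n".encode("utf-8")
print(read_rows(raw))                        # [['id', 'name'], ['1', 'apple']]
print(read_rows(raw, skip_first_line=True))  # [['1', 'apple']]
```

With the default (not checked), the header line is returned as an ordinary data row, which matches the default behavior described in the table.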
 
Data processing method
Item name Required/Optional Use of variables Description Remarks
Mass data processing Required Not available Select a data processing method.
  • [Use script settings]: (default)
    Applies the mass data processing setting of the script properties to the adapter.
  • [Disable]:
    Mass data processing is not performed.
  • [Enable]:
    Mass data processing is performed.
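The three choices above reduce to a simple rule: the component setting decides, unless it defers to the script. A minimal sketch of that rule follows. The names `MassDataProcessing` and `is_mass_data_processing_on` are hypothetical and exist only for this illustration.

```python
from enum import Enum


class MassDataProcessing(Enum):
    USE_SCRIPT_SETTINGS = "Use script settings"  # default
    DISABLE = "Disable"
    ENABLE = "Enable"


def is_mass_data_processing_on(component_setting: MassDataProcessing,
                               script_setting: bool) -> bool:
    """Return True when mass data processing is performed for the component."""
    if component_setting is MassDataProcessing.USE_SCRIPT_SETTINGS:
        # Defer to the mass data processing setting of the script properties.
        return script_setting
    return component_setting is MassDataProcessing.ENABLE


print(is_mass_data_processing_on(MassDataProcessing.USE_SCRIPT_SETTINGS, True))  # True
print(is_mass_data_processing_on(MassDataProcessing.DISABLE, True))              # False
```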
 
Comment
Item name Required/Optional Use of variables Description Remarks
Comment Optional Not available You can write a short description of this adapter.
The description will be reflected in the specifications.
 

Schema

Input Schema

None.

Output Schema

The number of columns depends on the [Column list] settings.
See "Schema of Table Model" for details regarding schema structure.

Loading Schema in Mapper

Schema is loaded automatically.
See "Edit Schema" for details.

Mass Data Processing

Mass data processing is supported.

PSP Usage

PSP is supported.
For details on PSP, refer to "Parallel Stream Processing".

Available Component Variables

Component variable name Description Remarks
count Returns the number of columns read.
  • The value defaults to null.
  • Null when using Parallel Stream Processing.
filePath Returns the path of the file that was read.
  • The value defaults to null.
message_category Stores the category to which the corresponding message code belongs when an error occurs.
  • The value defaults to null.
message_code Stores the message code corresponding to the error that occurred.
  • The value defaults to null.
message_level Stores the severity of the message code corresponding to the error that occurred.
  • The value defaults to null.
  • Does not store values in PSP.
error_type Returns the error type when an error occurs.
  • The value defaults to null.
  • The error type is represented in the format shown below.
    Example: java.io.FileNotFoundException
  • The message may vary depending on the DataSpider Servista version.
error_message Returns the error message when an error occurs.
  • The value defaults to null.
  • The message may vary depending on the DataSpider Servista version.
error_trace Returns trace information when an error occurs.
  • The value defaults to null.
  • The message may vary depending on the DataSpider Servista version or the client application used.

Null and Empty String

Main Exceptions

Exception name Causes Solution
ResourceNotFoundException
Resource definition is not found. Name:[]
[Destination] is not specified. Specify [Destination].
ResourceNotFoundException
Resource definition is not found. Name:[<Global resource name>]
Resource definition selected in [Destination] is not found. Check the global resource specified in [Destination].
InvalidPropertyConfigurationException
Property name is not specified.
[<Property name>] is not specified. Specify [<Property name>].
FileIsDirectoryException The path entered in [HDFS file Path] is a directory. Enter a file path in [HDFS file Path].
java.io.FileNotFoundException File specified in [HDFS file Path] does not exist. Check [HDFS file Path].
java.io.UnsupportedEncodingException An unsupported encoding is specified in [Encoding]. Specify an encoding that is supported in Java SE Runtime Environment 8.
Refer to "Supported Encodings"(http://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html) for details.