File Coverage

blib/lib/Paws/MachineLearning/RDSDataSpec.pm
Criterion    Covered   Total       %
statement          3       3   100.0
branch           n/a
condition        n/a
subroutine         1       1   100.0
pod              n/a
total              4       4   100.0


line stmt bran cond sub pod time code
1             package Paws::MachineLearning::RDSDataSpec;
2 1     1   475 use Moose;
  1         4  
  1         8  
3             has DatabaseCredentials => (is => 'ro', isa => 'Paws::MachineLearning::RDSDatabaseCredentials', required => 1);
4             has DatabaseInformation => (is => 'ro', isa => 'Paws::MachineLearning::RDSDatabase', required => 1);
5             has DataRearrangement => (is => 'ro', isa => 'Str');
6             has DataSchema => (is => 'ro', isa => 'Str');
7             has DataSchemaUri => (is => 'ro', isa => 'Str');
8             has ResourceRole => (is => 'ro', isa => 'Str', required => 1);
9             has S3StagingLocation => (is => 'ro', isa => 'Str', required => 1);
10             has SecurityGroupIds => (is => 'ro', isa => 'ArrayRef[Str|Undef]', required => 1);
11             has SelectSqlQuery => (is => 'ro', isa => 'Str', required => 1);
12             has ServiceRole => (is => 'ro', isa => 'Str', required => 1);
13             has SubnetId => (is => 'ro', isa => 'Str', required => 1);
14             1;
15              
16             ### main pod documentation begin ###
17              
18             =head1 NAME
19              
20             Paws::MachineLearning::RDSDataSpec
21              
22             =head1 USAGE
23              
24             This class represents one of two things:
25              
26             =head3 Arguments in a call to a service
27              
28             Use the attributes of this class as arguments to methods. You shouldn't make instances of this class.
29             Each attribute should be used as a named argument in the calls that expect this type of object.
30              
31             As an example, if Att1 is expected to be a Paws::MachineLearning::RDSDataSpec object:
32              
33             $service_obj->Method(Att1 => { DatabaseCredentials => $value, ..., SubnetId => $value });
34              
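As a fuller sketch, this object is accepted by the C<CreateDataSourceFromRDS> call in its C<RDSData> argument (a hedged example; all identifiers, credentials, and ARNs below are illustrative placeholders, not working values):

```perl
# Sketch: build the RDSData spec and create the datasource.
# Assumes $ml is a Paws MachineLearning service object, e.g.
#   my $ml = Paws->service('MachineLearning', region => 'us-east-1');
sub create_rds_datasource {
    my ($ml) = @_;
    return $ml->CreateDataSourceFromRDS(
        DataSourceId => 'my-rds-datasource',                        # illustrative
        RoleARN      => 'arn:aws:iam::123456789012:role/MLRole',    # illustrative
        RDSData      => {
            DatabaseCredentials => { Username => 'user', Password => 'secret' },
            DatabaseInformation => { DatabaseName       => 'mydb',
                                     InstanceIdentifier => 'my-db-instance' },
            ResourceRole        => 'DataPipelineDefaultResourceRole',
            S3StagingLocation   => 's3://my-bucket/staging/',
            SecurityGroupIds    => ['sg-12345678'],
            SelectSqlQuery      => 'SELECT * FROM observations',
            ServiceRole         => 'DataPipelineDefaultRole',
            SubnetId            => 'subnet-12345678',
        },
    );
}
```

Every attribute marked B<REQUIRED> under L</ATTRIBUTES> must be present in the C<RDSData> hash.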
35             =head3 Results returned from an API call
36              
37             Use accessors for each attribute. If Att1 is expected to be a Paws::MachineLearning::RDSDataSpec object:
38              
39             $result = $service_obj->Method(...);
40             $result->Att1->DatabaseCredentials
41              
42             =head1 DESCRIPTION
43              
44             The data specification of an Amazon Relational Database Service (Amazon
45             RDS) C<DataSource>.
46              
47             =head1 ATTRIBUTES
48              
49              
50             =head2 B<REQUIRED> DatabaseCredentials => L<Paws::MachineLearning::RDSDatabaseCredentials>
51              
52             The AWS Identity and Access Management (IAM) credentials that are used
53             to connect to the Amazon RDS database.
54              
55              
56             =head2 B<REQUIRED> DatabaseInformation => L<Paws::MachineLearning::RDSDatabase>
57              
58             Describes the C<DatabaseName> and C<InstanceIdentifier> of an Amazon
59             RDS database.
60              
61              
62             =head2 DataRearrangement => Str
63              
64             A JSON string that represents the splitting and rearrangement
65             processing to be applied to a C<DataSource>. If the
66             C<DataRearrangement> parameter is not provided, all of the input data
67             is used to create the C<Datasource>.
68              
69             There are multiple parameters that control what data is used to create
70             a datasource:
71              
72             =over
73              
74             =item *
75              
76             B<C<percentBegin>>
77              
78             Use C<percentBegin> to indicate the beginning of the range of the data
79             used to create the Datasource. If you do not include C<percentBegin>
80             and C<percentEnd>, Amazon ML includes all of the data when creating the
81             datasource.
82              
83             =item *
84              
85             B<C<percentEnd>>
86              
87             Use C<percentEnd> to indicate the end of the range of the data used to
88             create the Datasource. If you do not include C<percentBegin> and
89             C<percentEnd>, Amazon ML includes all of the data when creating the
90             datasource.
91              
92             =item *
93              
94             B<C<complement>>
95              
96             The C<complement> parameter instructs Amazon ML to use the data that is
97             not included in the range of C<percentBegin> to C<percentEnd> to create
98             a datasource. The C<complement> parameter is useful if you need to
99             create complementary datasources for training and evaluation. To create
100             a complementary datasource, use the same values for C<percentBegin> and
101             C<percentEnd>, along with the C<complement> parameter.
102              
103             For example, the following two datasources do not share any data, and
104             can be used to train and evaluate a model. The first datasource has 25
105             percent of the data, and the second one has 75 percent of the data.
106              
107             Datasource for evaluation: C<{"splitting":{"percentBegin":0,
108             "percentEnd":25}}>
109              
110             Datasource for training: C<{"splitting":{"percentBegin":0,
111             "percentEnd":25, "complement":"true"}}>
112              
113             =item *
114              
115             B<C<strategy>>
116              
117             To change how Amazon ML splits the data for a datasource, use the
118             C<strategy> parameter.
119              
120             The default value for the C<strategy> parameter is C<sequential>,
121             meaning that Amazon ML takes all of the data records between the
122             C<percentBegin> and C<percentEnd> parameters for the datasource, in the
123             order that the records appear in the input data.
124              
125             The following two C<DataRearrangement> lines are examples of
126             sequentially ordered training and evaluation datasources:
127              
128             Datasource for evaluation: C<{"splitting":{"percentBegin":70,
129             "percentEnd":100, "strategy":"sequential"}}>
130              
131             Datasource for training: C<{"splitting":{"percentBegin":70,
132             "percentEnd":100, "strategy":"sequential", "complement":"true"}}>
133              
134             To randomly split the input data into the proportions indicated by the
135             C<percentBegin> and C<percentEnd> parameters, set the C<strategy> parameter
136             to C<random> and provide a string that is used as the seed value for
137             the random data splitting (for example, you can use the S3 path to your
138             data as the random seed string). If you choose the random split
139             strategy, Amazon ML assigns each row of data a pseudo-random number
140             between 0 and 100, and then selects the rows that have an assigned
141             number between C<percentBegin> and C<percentEnd>. Pseudo-random numbers
142             are assigned using both the input seed string value and the byte offset
143             as a seed, so changing the data results in a different split. Any
144             existing ordering is preserved. The random splitting strategy ensures
145             that variables in the training and evaluation data are distributed
146             similarly. It is useful in the cases where the input data may have an
147             implicit sort order, which would otherwise result in training and
148             evaluation datasources containing non-similar data records.
149              
150             The following two C<DataRearrangement> lines are examples of
151             non-sequentially ordered training and evaluation datasources:
152              
153             Datasource for evaluation: C<{"splitting":{"percentBegin":70,
154             "percentEnd":100, "strategy":"random",
155             "randomSeed":"s3://my_s3_path/bucket/file.csv"}}>
156              
157             Datasource for training: C<{"splitting":{"percentBegin":70,
158             "percentEnd":100, "strategy":"random",
159             "randomSeed":"s3://my_s3_path/bucket/file.csv", "complement":"true"}}>
160              
161             =back
162              
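Because C<DataRearrangement> is a plain JSON string, it can be assembled from a Perl data structure rather than written by hand. A minimal sketch using the core JSON::PP module, with the random-split values from the examples above:

```perl
use JSON::PP ();

# Random split of the 70-100 percent range, seeded from the S3 path
# of the input data (the same values as the example above).
my $rearrangement = JSON::PP->new->canonical->encode({
    splitting => {
        percentBegin => 70,
        percentEnd   => 100,
        strategy     => 'random',
        randomSeed   => 's3://my_s3_path/bucket/file.csv',
    },
});

# $rearrangement now holds a JSON string suitable for the
# DataRearrangement attribute.
```

C<canonical> makes the key order deterministic, which keeps the encoded string stable across runs.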
163              
164              
165             =head2 DataSchema => Str
166              
167             A JSON string that represents the schema for an Amazon RDS
168             C<DataSource>. The C<DataSchema> defines the structure of the
169             observation data in the data file(s) referenced in the C<DataSource>.
170              
171             A C<DataSchema> is not required if you specify a C<DataSchemaUri>.
172              
173             Define your C<DataSchema> as a series of key-value pairs. C<attributes>
174             and C<excludedVariableNames> have an array of key-value pairs for their
175             value. Use the following format to define your C<DataSchema>.
176              
177                 {
178                   "version": "1.0",
179                   "recordAnnotationFieldName": "F1",
180                   "recordWeightFieldName": "F2",
181                   "targetFieldName": "F3",
182                   "dataFormat": "CSV",
183                   "dataFileContainsHeader": true,
184                   "attributes": [
185                     { "fieldName": "F1", "fieldType": "TEXT" },
186                     { "fieldName": "F2", "fieldType": "NUMERIC" },
187                     { "fieldName": "F3", "fieldType": "CATEGORICAL" },
188                     { "fieldName": "F4", "fieldType": "NUMERIC" },
189                     { "fieldName": "F5", "fieldType": "CATEGORICAL" },
190                     { "fieldName": "F6", "fieldType": "TEXT" },
191                     { "fieldName": "F7", "fieldType": "WEIGHTED_INT_SEQUENCE" },
192                     { "fieldName": "F8", "fieldType": "WEIGHTED_STRING_SEQUENCE" }
193                   ],
194                   "excludedVariableNames": [ "F6" ]
195                 }
200              
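Since C<DataSchema> is also a plain JSON string, the schema can likewise be built as a Perl structure and serialised. A reduced sketch using the core JSON::PP module (field names follow the example above; the field list is shortened for brevity):

```perl
use JSON::PP ();

# A reduced version of the schema above, built as a Perl structure
# and serialised for the DataSchema attribute.
my $data_schema = JSON::PP->new->canonical->encode({
    version                => '1.0',
    targetFieldName        => 'F3',
    dataFormat             => 'CSV',
    dataFileContainsHeader => JSON::PP::true(),   # encodes as JSON true
    attributes             => [
        { fieldName => 'F1', fieldType => 'TEXT' },
        { fieldName => 'F2', fieldType => 'NUMERIC' },
        { fieldName => 'F3', fieldType => 'CATEGORICAL' },
    ],
    excludedVariableNames  => [],
});
```

Using C<JSON::PP::true()> (rather than the string C<"true">) ensures C<dataFileContainsHeader> is emitted as a JSON boolean.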
201              
202             =head2 DataSchemaUri => Str
203              
204             The Amazon S3 location of the C<DataSchema>.
205              
206              
207             =head2 B<REQUIRED> ResourceRole => Str
208              
209             The role (C<DataPipelineDefaultResourceRole>) assumed by an Amazon Elastic
210             Compute Cloud (Amazon EC2) instance to carry out the copy operation
211             from Amazon RDS to Amazon S3. For more information, see Role
212             templates for data pipelines.
213              
214              
215             =head2 B<REQUIRED> S3StagingLocation => Str
216              
217             The Amazon S3 location for staging Amazon RDS data. The data retrieved
218             from Amazon RDS using C<SelectSqlQuery> is stored in this location.
219              
220              
221             =head2 B<REQUIRED> SecurityGroupIds => ArrayRef[Str|Undef]
222              
223             The security group IDs to be used to access a VPC-based RDS DB
224             instance. Ensure that there are appropriate ingress rules set up to
225             allow access to the RDS DB instance. This attribute is used by AWS
226             Data Pipeline to carry out the copy operation from Amazon RDS to
227             Amazon S3.
228              
229              
230             =head2 B<REQUIRED> SelectSqlQuery => Str
231              
232             The query that is used to retrieve the observation data for the
233             C<DataSource>.
234              
235              
236             =head2 B<REQUIRED> ServiceRole => Str
237              
238             The role (C<DataPipelineDefaultRole>) assumed by the AWS Data Pipeline
239             service to monitor the progress of the copy task from Amazon RDS to
240             Amazon S3. For more information, see Role templates for data pipelines.
241              
242              
243             =head2 B<REQUIRED> SubnetId => Str
244              
245             The subnet ID to be used to access a VPC-based RDS DB instance. This
246             attribute is used by Data Pipeline to carry out the copy task from
247             Amazon RDS to Amazon S3.
248              
249              
250              
251             =head1 SEE ALSO
252              
253             This class forms part of L<Paws>, describing an object used in L<Paws::MachineLearning>.
254              
255             =head1 BUGS and CONTRIBUTIONS
256              
257             The source code is located here: https://github.com/pplu/aws-sdk-perl
258              
259             Please report bugs to: https://github.com/pplu/aws-sdk-perl/issues
260              
261             =cut
262