Friday, June 24, 2022

Time Series Data

A time series is a sequence of data points indexed or listed in time order. Many financial datasets are stored as time series because financial data typically consists of observations recorded at specific times.

Time series data can be either structured or semi-structured. Imagine you’re receiving location data in records from a taxi’s GPS tracking device at regular time intervals. The data might arrive in the following format:

[
  {
    "cab": "cab_238",
    "coord": (43.602508,39.715685),
    "tm": "14:47",
    "state": "available"
  },
  {
    "cab": "cab_238",
    "coord": (43.613744,39.705718),
    "tm": "14:48",
    "state": "available"
  }
  ...
]

A new record arrives every minute with the latest location coordinates (latitude/longitude) from cab_238. Each record has the same sequence of fields, and each field has a consistent structure from one record to the next, so you can store this time series in a relational database table as regular structured data.
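As a rough illustration of why consistent records map cleanly onto a table, the sketch below loads the two sample records into an in-memory SQLite database. The table name `cab_positions` and the column names `lat` and `lon` are assumptions made for this example, not part of the original data format.

```python
import sqlite3

# Sample records mirroring the taxi example; every record has the same fields.
records = [
    {"cab": "cab_238", "coord": (43.602508, 39.715685), "tm": "14:47", "state": "available"},
    {"cab": "cab_238", "coord": (43.613744, 39.705718), "tm": "14:48", "state": "available"},
]

# Hypothetical schema: one column per field, with the coordinate pair split in two.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cab_positions (cab TEXT, lat REAL, lon REAL, tm TEXT, state TEXT)"
)

# Because each record has exactly one coordinate pair, it becomes exactly one row.
conn.executemany(
    "INSERT INTO cab_positions VALUES (?, ?, ?, ?, ?)",
    [(r["cab"], r["coord"][0], r["coord"][1], r["tm"], r["state"]) for r in records],
)

# Read the series back in time order.
rows = conn.execute("SELECT tm, lat, lon FROM cab_positions ORDER BY tm").fetchall()
```

Each incoming record turns into one row without any special handling, which is what makes this data structured.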

Now suppose the data arrives at irregular intervals, which is often the case in practice, and that you receive more than one set of coordinates within a single minute. The incoming structure might look like this:

[
  {
    "cab": "cab_238",
    "coord": [(43.602508,39.715685),(43.602402,39.709672)],
    "tm": "14:47",
    "state": "available"
  },
  {
    "cab": "cab_238",
    "coord": (43.613744,39.705718),
    "tm": "14:48",
    "state": "available"
  }
]

Note that the first coord field includes two sets of coordinates and is thus not consistent with the second coord field. This data is semi-structured.
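One possible way to handle the inconsistent coord field, sketched here under the assumption that a single reading arrives as a tuple and multiple readings arrive as a list of tuples, is to normalize every record to a list of coordinate pairs before storing it. The helper names `normalize_coords` and `flatten` are hypothetical.

```python
def normalize_coords(record):
    """Return the record's coordinates as a list of (lat, lon) tuples."""
    coord = record["coord"]
    # Assumption: several readings come as a list, a single reading as a bare tuple.
    if isinstance(coord, list):
        return [tuple(c) for c in coord]
    return [tuple(coord)]


def flatten(records):
    """Expand each record into one flat row per coordinate pair."""
    rows = []
    for r in records:
        for lat, lon in normalize_coords(r):
            rows.append((r["cab"], lat, lon, r["tm"], r["state"]))
    return rows


# Semi-structured input from the example above.
records = [
    {"cab": "cab_238",
     "coord": [(43.602508, 39.715685), (43.602402, 39.709672)],
     "tm": "14:47", "state": "available"},
    {"cab": "cab_238", "coord": (43.613744, 39.705718),
     "tm": "14:48", "state": "available"},
]

rows = flatten(records)
```

After flattening, the first record contributes two rows that share the "14:47" timestamp, so the order of readings within that minute is preserved only by row order.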
