Data Types and File Structure Overview
The Flat Files S3 API provides access to various types of cryptocurrency market data. This document gives an overview of the available data types and the general file structure used across the system.
Available Data Types
Our API offers the following types of market data:
- Quotes
- Trades
- Limit Book Snapshots
- Full Limit Order Book
- OHLCV (Open, High, Low, Close, Volume) - Coming soon
Each data type is documented in detail in its respective article, including field descriptions, data formats, and usage examples. You will find the articles in the left-side menu.
File Structure
Data in the S3 bucket is organized according to the following structure:
/
├── T-TRADES/
│   └── D-YYYYMMDD/
│       └── E-[EXCHANGE]/
│           └── IDDI-[IDENTIFIER]+SC-[COINAPI_SYMBOL_ID]+S-[EXCHANGE_SYMBOL].csv.gz
├── T-QUOTES/
│   └── D-YYYYMMDD/
│       └── E-[EXCHANGE]/
│           └── IDDI-[IDENTIFIER]+SC-[COINAPI_SYMBOL_ID]+S-[EXCHANGE_SYMBOL].csv.gz
├── T-LIMITBOOK_FULL/
│   └── D-YYYYMMDD/
│       └── E-[EXCHANGE]/
│           └── IDDI-[IDENTIFIER]+SC-[COINAPI_SYMBOL_ID]+S-[EXCHANGE_SYMBOL].csv.gz
└── T-OHLCV/
    └── D-YYYYMMDD/
        └── E-[EXCHANGE]/
            └── IDDI-[IDENTIFIER]+SC-[COINAPI_SYMBOL_ID]+S-[EXCHANGE_SYMBOL].csv.gz
Where:
- YYYYMMDD represents the date of the data
- [EXCHANGE] is the identifier for the specific exchange
- [IDENTIFIER] is a unique identifier for the data file
- [COINAPI_SYMBOL_ID] is the CoinAPI symbol identifier
- [EXCHANGE_SYMBOL] is the symbol as used by the exchange
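The naming scheme above can be split back into its components with a regular expression. The following is a minimal sketch, not an official client: the `FlatFileKey` type, the `parse_key` function, and the sample key in the usage note are hypothetical, and the pattern assumes that the identifier and CoinAPI symbol ID contain no `+` and that no component contains `/`.

```python
import re
from typing import NamedTuple


class FlatFileKey(NamedTuple):
    """Components of an object key (hypothetical helper type)."""
    data_type: str
    date: str
    exchange: str
    identifier: str
    coinapi_symbol_id: str
    exchange_symbol: str


# Mirrors T-<TYPE>/D-<YYYYMMDD>/E-<EXCHANGE>/IDDI-<ID>+SC-<SYMBOL_ID>+S-<SYMBOL>.csv.gz
# Assumption: identifier and symbol ID contain no "+", and no part contains "/".
_KEY_PATTERN = re.compile(
    r"^/?T-(?P<data_type>[^/]+)"
    r"/D-(?P<date>\d{8})"
    r"/E-(?P<exchange>[^/]+)"
    r"/IDDI-(?P<identifier>[^+/]+)"
    r"\+SC-(?P<coinapi_symbol_id>[^+/]+)"
    r"\+S-(?P<exchange_symbol>[^/]+)\.csv\.gz$"
)


def parse_key(key: str) -> FlatFileKey:
    """Split an object key into its named parts, or raise ValueError."""
    match = _KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"key does not follow the expected layout: {key!r}")
    return FlatFileKey(**match.groupdict())
```

For example, `parse_key("T-TRADES/D-20240101/E-EXAMPLEX/IDDI-12345+SC-EXAMPLEX_SPOT_BTC_USD+S-BTCUSD.csv.gz")` would yield a tuple whose `exchange` field is `"EXAMPLEX"`; the exchange name and identifier here are made up for illustration.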
File Format
All data files are stored in CSV (Comma-Separated Values) format and compressed with gzip. This approach balances human readability with efficient storage and transfer sizes.
To use these files:
- Download the .csv.gz file
- Decompress the file using a tool that supports gzip (e.g., gzip, 7-zip)
- Open the resulting CSV file in a spreadsheet application or process it with your preferred data analysis tool
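When processing files programmatically, the decompress and parse steps can be combined so that no intermediate CSV is written to disk. The sketch below uses only the Python standard library. The `iter_rows` name is hypothetical, and it assumes that the first row is a header and that fields are comma-separated; check the article for each data type for the actual columns and delimiter, and pass a different `delimiter` if needed.

```python
import csv
import gzip
from typing import Dict, Iterator


def iter_rows(path: str, delimiter: str = ",") -> Iterator[Dict[str, str]]:
    """Stream rows from a .csv.gz file as dictionaries keyed by header name.

    Assumptions: the first row holds column names and the file is UTF-8.
    """
    # Text mode ("rt") decompresses on the fly; newline="" lets csv handle line endings.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter=delimiter)
```

Because the function is a generator, rows are read lazily, which keeps memory use flat even for large daily files.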
Data Consistency and Synchronization
All timestamp fields across different data types are synchronized to ensure consistency when analyzing data from multiple sources. We use high-precision time synchronization to maintain accuracy across our data collection infrastructure.
Best Practices for Data Retrieval
- Use date-based partitioning to efficiently retrieve data for specific time periods.
- Leverage the prefix functionality in S3 listing operations to filter data by exchange, symbol, or date.
- Implement parallel downloads for large datasets to improve retrieval speed.
- Consider implementing local caching for frequently accessed data to reduce API calls and improve application performance.
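The practices above could be combined roughly as follows. This is a sketch under stated assumptions rather than a reference implementation: `client` is assumed to be a boto3 S3 client already configured with your endpoint and credentials, the function names are hypothetical, and object keys are assumed to carry no leading slash. Only `get_paginator("list_objects_v2")` and `download_file` are called on the client.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from typing import Iterable, Iterator, List, Optional


def date_prefixes(data_type: str, start: date, end: date,
                  exchange: Optional[str] = None) -> List[str]:
    """Build one listing prefix per day, optionally narrowed to one exchange."""
    prefixes = []
    day = start
    while day <= end:
        prefix = f"T-{data_type}/D-{day:%Y%m%d}/"
        if exchange is not None:
            prefix += f"E-{exchange}/"
        prefixes.append(prefix)
        day += timedelta(days=1)
    return prefixes


def list_keys(client, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every object key under a prefix, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]


def download_all(client, bucket: str, keys: Iterable[str],
                 cache_dir: str, max_workers: int = 8) -> List[str]:
    """Download keys in parallel, skipping files already in the local cache."""

    def fetch(key: str) -> str:
        local_path = os.path.join(cache_dir, *key.split("/"))
        if not os.path.exists(local_path):  # simple cache check by file presence
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            client.download_file(bucket, key, local_path)
        return local_path

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, keys))
```

A typical call would list keys for each prefix returned by `date_prefixes("TRADES", date(2024, 1, 1), date(2024, 1, 7), exchange="EXAMPLEX")` and pass them to `download_all`; the exchange name is a placeholder. The cache check only tests whether a file exists, so an interrupted download would need to be removed manually before retrying.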
For detailed information on each data type, please refer to the individual data type documentation.