Big Data Formats

Updated: Feb 21, 2020

Smart companies don't let novices play with their most critical resource, their data.

Different file formats and how to read them in Python

Comma-separated values

Plain Text (txt)

Hierarchical Data Format

What is a file format?

A file format is a standard way of encoding information for storage in a file. First, the format specifies whether the file is binary or plain text (for example, ASCII). Second, it defines how the information inside the file is organized. For example, the comma-separated values (CSV) format stores tabular data in plain text: each line is one record, and the fields within a record are separated by commas.
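As a minimal sketch of how tabular data in CSV form can be read in Python, the example below uses the standard library's csv module. The sample text, the column names, and the helper name read_csv_rows are assumptions made for illustration; they do not come from the original post.

```python
import csv
import io

# Hypothetical sample standing in for the contents of a file such as Data.csv.
SAMPLE_CSV = "name,age\nAda,36\nAlan,41\n"


def read_csv_rows(text_stream):
    """Return the records of a CSV text stream as a list of dicts.

    The first line is treated as the header row, so each dict maps a
    column name to the (string) value found in that record.
    """
    reader = csv.DictReader(text_stream)
    return [dict(row) for row in reader]


# io.StringIO lets the sample behave like an opened text file.
rows = read_csv_rows(io.StringIO(SAMPLE_CSV))
print(rows)
```

With a real file, the same function could be called as read_csv_rows(open("Data.csv", newline="")). Note that the csv module returns every field as a string, so numeric columns need explicit conversion.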

The file extension is usually the quickest hint to a file's format. For example, a file named “Data” saved in CSV format appears as “Data.csv”. The “.csv” extension tells us that the file is most likely a CSV file with its data stored in tabular form, although an extension is only a naming convention and does not guarantee the contents.
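The extension check described above can be sketched in Python with pathlib. The mapping KNOWN_EXTENSIONS and the function guess_format are hypothetical names introduced for this example, and the table covers only the three formats listed in this post.

```python
from pathlib import Path

# Illustrative mapping only (an assumption, not an exhaustive registry).
KNOWN_EXTENSIONS = {
    ".csv": "comma-separated values",
    ".txt": "plain text",
    ".h5": "Hierarchical Data Format",
    ".hdf5": "Hierarchical Data Format",
}


def guess_format(filename):
    """Guess a file's format from its extension, ignoring letter case.

    Returns "unknown" when the extension is missing or not in the table.
    This is only a hint: the extension does not prove what the file contains.
    """
    suffix = Path(filename).suffix.lower()
    return KNOWN_EXTENSIONS.get(suffix, "unknown")


print(guess_format("Data.csv"))
```

Because the guess relies on the name alone, a renamed file will be misidentified; opening the file and inspecting its contents is the only reliable check.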


©2020 by Arturo Devesa.