Big Data Formats

Updated: Feb 21, 2020

Smart companies don't let novices play with their most critical resource, their data.

Different file formats and how to read them in Python

Comma-separated values

Plain Text (txt)

Hierarchical Data Format

What is a file format?

A file format is a standard way of encoding information for storage in a file. First, the format specifies whether the file is binary or text (for example, ASCII). Second, it defines how the information is organized. For example, the comma-separated values (CSV) format stores tabular data in plain text.
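To make the "tabular data in plain text" idea concrete, here is a minimal sketch using only Python's standard library. The sample rows, the helper name `write_and_read_csv`, and the temporary file location are assumptions made for illustration; they are not taken from any particular dataset.

```python
import csv
import tempfile
from pathlib import Path


def write_and_read_csv(rows, path):
    """Write rows to a CSV file, then return (raw_text, parsed_rows)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

    # Read the file back as ordinary text: a CSV file is just characters.
    raw_text = Path(path).read_text(encoding="utf-8")

    # Read it again with csv.reader to recover the table structure.
    with open(path, newline="", encoding="utf-8") as f:
        parsed_rows = list(csv.reader(f))

    return raw_text, parsed_rows


# Hypothetical sample data: a header row followed by two records.
rows = [["name", "age"], ["Ada", "36"], ["Linus", "28"]]

with tempfile.TemporaryDirectory() as tmp:
    raw_text, parsed_rows = write_and_read_csv(rows, Path(tmp) / "Data.csv")

print(raw_text)      # the table as plain, comma-separated text
print(parsed_rows)   # the same table as a list of rows
```

Note that `csv.reader` returns every field as a string, so numeric columns such as `age` need an explicit conversion before use.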

You can usually identify a file's format from its extension. For example, a file named “Data” saved in CSV format appears as “Data.csv”. The “.csv” extension indicates that it is a CSV file and that its data is stored in tabular form.
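The extension check can be sketched in a few lines of Python. The `KNOWN_FORMATS` table and the `guess_format` helper below are hypothetical names introduced for this example, and the mapping covers only a handful of common extensions. Keep in mind that an extension is only a hint: a file can be renamed without its contents changing.

```python
from pathlib import Path

# Hypothetical lookup table for illustration; extend it as needed.
KNOWN_FORMATS = {
    ".csv": "comma-separated values",
    ".txt": "plain text",
    ".h5": "Hierarchical Data Format",
    ".hdf5": "Hierarchical Data Format",
}


def guess_format(filename):
    """Guess a file's format from its extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    return KNOWN_FORMATS.get(suffix, "unknown")


print(guess_format("Data.csv"))
print(guess_format("Data"))
```

A file name with no extension, such as `Data`, has an empty suffix and falls through to `"unknown"`.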
