Big Data Formats

Updated: Feb 21, 2020


Smart companies don't let novices play with their most critical resource, their data.


Different file formats and how to read them in Python

Comma-separated values

XLSX

ZIP

Plain Text (txt)

JSON

XML

HTML

Images

Hierarchical Data Format

PDF

DOCX

MP3

MP4


What is a file format?


A file format is a standard way of encoding information for storage in a file. First, the format specifies whether the file is binary or plain text (such as ASCII). Second, it defines how the information is organized. For example, the comma-separated values (CSV) format stores tabular data in plain text.
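To make the CSV example concrete, here is a minimal Python sketch using the standard library's csv module. The sample data and the helper name read_csv_text are made up for illustration; in practice the text would come from a file on disk.

```python
import csv
import io

# Hypothetical sample data, standing in for the contents of a .csv file.
sample = "name,age\nAda,36\nLinus,28\n"

def read_csv_text(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    reader = csv.DictReader(io.StringIO(text))  # first line is used as the header
    return [dict(row) for row in reader]

rows = read_csv_text(sample)
for row in rows:
    print(row["name"], row["age"])

# The same content as raw bytes: because CSV is plain text,
# the bytes are simply the encoded characters.
raw = sample.encode("ascii")
print(raw[:8])
```

Note that the csv module returns every value as a string; converting "36" to a number is left to the caller.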


To identify a file format, you can usually look at the file extension. For example, a file named “Data” saved in CSV format appears as “Data.csv”. The “.csv” extension tells us that it is a CSV file and that its data is stored in a tabular format.
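The extension check can be sketched in Python with pathlib. The EXTENSION_HINTS table and the guess_format function are hypothetical names introduced here, and the table covers only a few of the formats listed earlier. Keep in mind that an extension is only a hint: a file can be renamed without its contents changing.

```python
from pathlib import Path

# Hypothetical lookup table, for illustration only.
EXTENSION_HINTS = {
    ".csv": "CSV (tabular plain text)",
    ".xlsx": "XLSX (spreadsheet)",
    ".json": "JSON",
    ".txt": "Plain text",
}

def guess_format(filename):
    """Guess a file's format from its extension, ignoring case."""
    suffix = Path(filename).suffix.lower()  # "Data.csv" -> ".csv"
    return EXTENSION_HINTS.get(suffix, "unknown")

print(guess_format("Data.csv"))
print(guess_format("Data"))
```

A file with no extension, such as “Data”, has an empty suffix and falls through to "unknown".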
