1. Install the libraries with pip in a separate cell. If you do not have admin privileges, add --user at the end of the command to install the packages into your user directory instead. The -q flag runs pip in quiet mode (reduced log output).
!pip install -q datascience # Package required by pandas profiling
!pip install -q pandas-profiling # Pandas Profiling
2. Upgrade the pandas_profiling package in a separate cell. Add --user at the end of the command if you do not have admin privileges.
!pip install -q --upgrade pandas_profiling # Upgrade pandas profiling
3. Restart the kernel, and do not re-run the cells from steps 1 and 2. (Ignore any warnings that appear.)
4. Import the libraries as:
import pandas as pd
import numpy as np
from pandas_profiling import ProfileReport
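Note: newer releases of this package are published under the name ydata-profiling, so the import above may fail on a fresh install. A minimal sketch of an import that tries both names is shown below; falling back to None when neither package is installed is an illustrative choice here, not something the library prescribes.

```python
# Try the newer package name first, then the older one.
try:
    from ydata_profiling import ProfileReport
except ImportError:
    try:
        from pandas_profiling import ProfileReport
    except ImportError:
        # Assumption for illustration: signal "not installed" with None
        ProfileReport = None

if ProfileReport is None:
    print("Profiling package not found; re-run the install cells from step 1.")
```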
5. Read the data as:
data = pd.read_csv('Data.csv')
6. Perform Profiling as:
profile = ProfileReport(data)
profile
7. If you're working with a large dataset, pass minimal=True to ProfileReport. This switches off the most expensive computations, such as correlations, so the report is generated faster.
profile = ProfileReport(data, minimal=True)
profile
Alternatively, run the profile on a subset of the data first, say 500 rows. If that works, try a larger subset, and eventually the entire dataset.
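The subset approach can be sketched roughly as follows. Note that growing_subsets is a hypothetical helper written for this example, not part of pandas or pandas-profiling; the starting size of 500 comes from the suggestion above, while the growth factor of 10 and the demo DataFrame (standing in for Data.csv) are assumptions.

```python
import pandas as pd

def growing_subsets(df, start=500, factor=10, seed=0):
    """Yield random subsets of df of increasing size, ending with the full DataFrame.

    Hypothetical helper for illustration only.
    """
    n = start
    while n < len(df):
        # Random sample of n rows; a fixed seed keeps the subsets reproducible
        yield df.sample(n=n, random_state=seed)
        n *= factor
    yield df

# Demo data standing in for pd.read_csv('Data.csv')
demo = pd.DataFrame({"x": range(12000)})
sizes = [len(subset) for subset in growing_subsets(demo)]
print(sizes)  # [500, 5000, 12000]
```

Each subset can then be passed to ProfileReport in turn, for example ProfileReport(subset, minimal=True), stopping at the first size that takes too long or runs out of memory.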
Please visit the pandas-profiling documentation for further information: https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/pages/installation.html