### Transforming DataFrames
#### Introduction to DataFrames
#pandas is built on top of #numpy and #matplotlib.
**Rectangular data** (also called **tabular data**) is data stored in rows and columns. #pandas is designed to work with rectangular data, which it represents as a DataFrame object.
```python
# Exploring a DataFrame
print(dogs.head()) # returns the first 5 rows of the DataFrame
dogs.info() # prints column names, dtypes, and non-null counts (it returns None, so no print() is needed)
print(dogs.shape) # tuple: (number of rows, number of columns)
print(dogs.describe()) # summary statistics for the numeric columns
print(dogs.values) # 2D NumPy array of the DataFrame's values
print(dogs.columns) # Index holding the column names
print(dogs.index) # row index (a RangeIndex by default)
```
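The calls above assume a `dogs` DataFrame already exists. A minimal sketch of one, using made-up sample values (the names and measurements here are hypothetical, not real data), so the exploration attributes can be tried out:

```python
import pandas as pd

# Hypothetical sample data; every value is invented for illustration.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Labrador"],
    "color": ["Brown", "Black", "Brown", "Tan"],
    "height_cm": [56, 43, 46, 59],
    "weight_kg": [25, 23, 22, 29],
})

n_rows, n_cols = dogs.shape          # shape is an attribute, not a method
column_names = list(dogs.columns)    # convert the Index to a plain list

print(n_rows, n_cols)
print(column_names)
print(dogs.to_numpy())               # same data as dogs.values
```

The pandas documentation recommends `to_numpy()` over `.values` in new code, though both return the underlying array here.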
#### Sorting and subsetting
**Sorting** a DataFrame
```python
dogs.sort_values('weight_kg') # ascending (the default)
dogs.sort_values('weight_kg', ascending=False) # descending
dogs.sort_values(['weight_kg', 'height_cm']) # sorts by weight, then by height to break ties
dogs.sort_values(['weight_kg', 'height_cm'], ascending=[True, False]) # weight ascending, height descending
```
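One detail worth noting: `sort_values` returns a new, sorted DataFrame and leaves the original untouched unless the result is assigned back. A small sketch with made-up data (two dogs share a weight so the tie-break is visible):

```python
import pandas as pd

# Hypothetical sample data; Bella and Lucy share a weight on purpose.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper"],
    "height_cm": [56, 43, 46, 59],
    "weight_kg": [25, 23, 25, 29],
})

# Weight ascending; ties broken by height descending.
sorted_dogs = dogs.sort_values(
    ["weight_kg", "height_cm"], ascending=[True, False]
)

sorted_names = list(sorted_dogs["name"])
original_names = list(dogs["name"])   # unchanged by the sort

print(sorted_names)
print(original_names)
```

The sorted result keeps the original index labels; calling `.reset_index(drop=True)` on it gives a fresh 0-based index if that is preferred.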
**Subsetting columns**
```python
dogs['breed'] # subset one column
dogs[['breed', 'height_cm']] # multiple columns
```
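The two forms above return different types: single brackets with one name give a Series, while double brackets (a list of names) give a DataFrame, even when the list holds only one column. A quick check on made-up data:

```python
import pandas as pd

# Hypothetical sample data for illustration.
dogs = pd.DataFrame({
    "breed": ["Labrador", "Poodle"],
    "height_cm": [56, 43],
})

breed_series = dogs["breed"]     # single brackets -> Series
breed_frame = dogs[["breed"]]    # list of names   -> DataFrame

print(type(breed_series).__name__)  # Series
print(type(breed_frame).__name__)   # DataFrame
```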
**Subsetting rows**
```python
dogs[dogs['height_cm'] > 50] # all dogs taller than 50 cm
# Using Conditions
is_lab = dogs['breed'] == 'Labrador'
is_brown = dogs['color'] == 'Brown'
dogs[is_lab & is_brown]
# & -> and
# | -> or
# ~ -> not
# Multiple Categorical Conditions
dogs[dogs['color'].isin(['Black', 'Brown'])]
```
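When conditions are combined inline rather than stored in variables first, each comparison needs its own parentheses, because `&`, `|`, and `~` bind more tightly than `==` or `>` in Python. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical sample data for illustration.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Labrador"],
    "color": ["Brown", "Black", "Brown", "Tan"],
})

# and: both conditions must hold
brown_labs = dogs[(dogs["breed"] == "Labrador") & (dogs["color"] == "Brown")]

# not: invert a condition
not_labs = dogs[~(dogs["breed"] == "Labrador")]

# isin: match any value in a list
black_or_brown = dogs[dogs["color"].isin(["Black", "Brown"])]

print(list(brown_labs["name"]))
print(list(not_labs["name"]))
print(list(black_or_brown["name"]))
```

Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, which is why the symbol operators are used instead.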
#### New Columns
Adding new columns goes by several names:
- mutating
- transforming
- feature engineering
```python
dogs['height_m'] = dogs['height_cm'] / 100
```
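A new column can also be built from several existing columns, including one that was just added. As an illustrative sketch (the `bmi` column name and the sample values are assumptions, not part of the notes above), a body mass index can be derived from weight and the new height in metres:

```python
import pandas as pd

# Hypothetical sample data for illustration.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie"],
    "height_cm": [50, 40],
    "weight_kg": [25, 16],
})

# Convert centimetres to metres, element-wise.
dogs["height_m"] = dogs["height_cm"] / 100

# Combine two columns: weight divided by height squared.
dogs["bmi"] = dogs["weight_kg"] / dogs["height_m"] ** 2

print(dogs)
```

Arithmetic on columns is applied row by row, so no explicit loop is needed.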
****
### Aggregating DataFrames
****
### Slicing and Indexing DataFrames
****
### Creating and Visualizing DataFrames