### Python Basics
A #python-script is a text file with the `.py` extension.
- It contains a list of Python commands, as they would be typed in the terminal.
### Python Lists
A python #list is defined as follows `[a,b,c]`.
Lists can contain Lists!
```python
family = [['Andrew', 'Son'],
['Anthony', 'Dad'],
['Judith', 'Mom'],
['Raisha', 'Bae']]
```
**Indexing** in python is #zero-indexing which means it starts at 0
- `list[0]` returns the first element of the list
- `list[-1]` returns the last element of a list
- You can use +/- numbers to index forward/backward
**Slicing** lists in python creates subsets of a list
- `list[start:end]`
- start: inclusive
- end: exclusive
- You do not have to provide a start or an end index
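
A minimal sketch of the indexing and slicing rules above. The list `letters` and its values are made up for illustration.

```python
# Hypothetical example list
letters = ['a', 'b', 'c', 'd', 'e']

first = letters[0]        # 'a'  (index 0 is the first element)
last = letters[-1]        # 'e'  (negative indexes count from the end)
middle = letters[1:3]     # ['b', 'c']  (start inclusive, end exclusive)
head = letters[:2]        # ['a', 'b']  (no start: begin at index 0)
tail = letters[2:]        # ['c', 'd', 'e']  (no end: run to the last element)

# Lists of lists are indexed one level at a time
family = [['Andrew', 'Son'], ['Anthony', 'Dad']]
role = family[1][1]       # 'Dad'

print(first, last, middle, head, tail, role)
```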
#### Manipulating Lists
Changing list elements using **Indexing** and **Slicing**
```python
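x_list = [0, 0, 0] # avoid naming a variable `list`, which would hide the built-in
# Indexing
x_list[0] = 1
# Slicing (the replacement can be longer or shorter than the slice)
x_list[0:3] = [1,2,3,4]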
```
Adding/Removing Elements
```python
# Adding
[1] + [1,2,3,4,5] # Adding at the start
[1,2,3,4,5] + [1] # Adding at the end
# Removing (del is a statement, so no parentheses are needed)
del x_list[0] # Indexing
del x_list[0:3] # Slicing
```
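
To make the effect of each operation visible, here is a small sketch that applies them in sequence to a made-up list `nums`. Note that `+` builds a new list, while index assignment, slice assignment, and `del` change the list in place.

```python
nums = [10, 20, 30, 40, 50]   # hypothetical data

nums[0] = 1                   # indexing: replace one element
# nums is now [1, 20, 30, 40, 50]

nums[0:3] = [1, 2, 3, 4]      # slicing: replace 3 elements with 4, so the list grows
# nums is now [1, 2, 3, 4, 40, 50]

longer = [0] + nums           # + returns a new list; nums itself is unchanged
# longer is [0, 1, 2, 3, 4, 40, 50]

del nums[0]                   # remove the first element
# nums is now [2, 3, 4, 40, 50]

del nums[0:3]                 # remove the first three elements
# nums is now [40, 50]

print(nums, longer)
```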
Referencing and instantiating variables in python works as follows.
![[Screenshot 2024-06-11 at 5.32.45 PM.png]]
Copying lists without referencing the old list
```python
x = [1,2,3]
# Copying x in y
y = list(x)
# OR
y = x[:]
```
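
A short sketch of why the copy matters: plain assignment only creates a second name for the same list, while `list(x)` and `x[:]` create new lists. The name `alias` is introduced here only for illustration.

```python
x = [1, 2, 3]

alias = x          # no copy: both names refer to the same list object
copy_a = list(x)   # new list with the same elements
copy_b = x[:]      # same idea, using a full slice

x[0] = 99          # change the original

print(alias)       # [99, 2, 3]  (alias sees the change)
print(copy_a)      # [1, 2, 3]   (copies are unaffected)
print(copy_b)      # [1, 2, 3]
print(alias is x, copy_a is x)  # True False
```

Both copies are shallow: in a list of lists, the inner lists are still shared. `copy.deepcopy` from the standard library copies the inner lists as well.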
### Functions and Packages
A **function** is a piece of reusable code!
```python
# Some Built-in python functions
max(x_list) # Gives the maximum value in a list
type(x_list) # Gives the type
round(1.68,1) # Rounds numbers
help(round) # Provides documentation!
len(x_list) # Provides the number of elements in a list
sorted(x_list) # Sorts a list
```
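
The same calls with concrete results, as a quick sketch. The values in `x_list` are made up.

```python
x_list = [3.5, 1.25, 9.0, 4.75]   # hypothetical data

print(max(x_list))        # 9.0
print(type(x_list))       # <class 'list'>
print(round(1.68, 1))     # 1.7
print(len(x_list))        # 4
print(sorted(x_list))     # [1.25, 3.5, 4.75, 9.0]
print(x_list)             # [3.5, 1.25, 9.0, 4.75]  (sorted() returns a new list)
```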
A **method** is a Python function that belongs to an **object**.
```python
# Some list methods
x_list.index('a') # Returns the index of the first occurrence of 'a'
x_list.count('a') # Returns the number of times 'a' appears in the list
x_list.append('a') # Adds 'a' to the end of the list
```
```python
# Some string methods
x_str.capitalize() # Returns a copy with the first character capitalized and the rest lowercased
x_str.replace('a','b') # Returns a copy with all occurrences of 'a' replaced by 'b'
x_str.count('o') # counts the number of o's in x_str
```
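
A runnable sketch of the list and string methods above, with made-up values for `x_list` and `x_str`. It also shows one difference worth remembering: list methods such as `append` change the list in place, while string methods return a new string.

```python
x_list = ['a', 'b', 'a']   # hypothetical data
x_str = 'hello world'

print(x_list.index('a'))   # 0  (position of the first 'a')
print(x_list.count('a'))   # 2
x_list.append('c')         # changes x_list in place
print(x_list)              # ['a', 'b', 'a', 'c']

print(x_str.capitalize())        # Hello world
print(x_str.replace('o', '0'))   # hell0 w0rld
print(x_str.count('o'))          # 2
print(x_str)                     # hello world  (the original string is unchanged)
```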
An **attribute** is a value stored on an object. It is accessed like a method but without parentheses, e.g. `object.shape`.
A **package** is a directory of Python scripts, where each script is a **module**.
A **module** specifies functions, methods, and types.
```python
# Importing Python packages
import numpy as np
# Importing a specific name (here the array function) from a python package
from numpy import array
```
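
The import forms differ only in how the imported names are referenced afterwards. A small sketch using the standard-library `math` module, chosen here only because it needs no installation (it is not part of the original notes):

```python
import math                  # whole module; names are used with the math. prefix
import math as m             # same module under a shorter alias
from math import sqrt        # one name, usable without any prefix

print(math.sqrt(16))   # 4.0
print(m.sqrt(16))      # 4.0
print(sqrt(16))        # 4.0
```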
### NumPy
**NumPy** is numeric python.
**SPEED!!!** Since NumPy enforces a single data type per array, many of the statistical calculations below are much faster than the equivalent loops over a python list.
One advantage of a **NumPy** array over python #list is that you can perform mathematical operations on **NumPy** arrays. For a python #list you would need to loop through each value and perform the operation.
```python
import numpy as np
# In this example we are showing the difference between calculating BMI using a list vs a numpy array
height = [1, 2, 3, 4]
weight = [3, 4, 5, 6]
# Python List example
bmi = []
for i in range(len(height)):
    bmi.append(weight[i]/(height[i]**2))
print(bmi)
# Numpy Array example
np_height = np.array(height)
np_weight = np.array(weight)
print(np_weight/np_height**2)
# Result
# [3.0, 1.0, 0.5555555555555556, 0.375]
# [3.         1.         0.55555556 0.375     ]
```
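
Two consequences of the points above, sketched with made-up values: operators act element by element on arrays (while `+` concatenates lists), and mixed input is coerced to one common data type.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

print(py_list + py_list)   # [1, 2, 3, 1, 2, 3]  (+ concatenates lists)
print(np_arr + np_arr)     # [2 4 6]             (+ adds arrays element by element)
print(np_arr * 2)          # [2 4 6]

# Mixed input is coerced to one common type: int and bool become float here
mixed = np.array([1, 2.5, True])
print(mixed.dtype)         # float64
print(mixed.tolist())      # [1.0, 2.5, 1.0]
```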
NumPy arrays behave similarly to python #list.
```python
# Cool subsetting in NumPy arrays similar to DataFrames (bmi must be a NumPy array here, not a list)
bmi[bmi > 23] # returns an array with the elements greater than 23
```
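
The subsetting works in two steps: the comparison produces a boolean array, and that array then selects the matching elements. A sketch with made-up BMI values (the name `mask` is only for illustration):

```python
import numpy as np

bmi = np.array([21.8, 20.9, 24.7, 23.6])   # hypothetical values

mask = bmi > 23          # element-wise comparison gives a boolean array
print(mask)              # [False False  True  True]
print(bmi[mask])         # [24.7 23.6]
```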
**2D NumPy Arrays** are created using python list-of-lists!
```python
# Wrap the list of lists in np.array and keep the result
np_2d = np.array([[1,2],
                  [3,4]])
```
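
A quick way to check the result is the `shape` attribute, which gives (rows, columns). A minimal sketch, assuming every inner list has the same length:

```python
import numpy as np

np_2d = np.array([[1, 2],
                  [3, 4]])

print(np_2d.shape)   # (2, 2)  2 rows, 2 columns
print(np_2d.ndim)    # 2
print(np_2d * 10)    # arithmetic is element-wise, as with 1D arrays
```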
**Subsetting 2D Arrays**
```python
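# Assumes np_2d has at least 3 rows and 3 columns
np_2d[0][2] # returns the first row's third element
np_2d[0, 2] # same as above. Can be interpreted as first row, third column
np_2d[:, 1:3] # returns all rows, only columns 2 and 3
np_2d[:][1:3] # NOT the same: np_2d[:] is the whole array, so this returns rows 2 and 3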
```
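
A runnable sketch of 2D subsetting on a made-up 3x3 array named `grid`. It shows that the comma form selects columns, while chaining `[:]` and `[1:3]` slices rows.

```python
import numpy as np

# Hypothetical 3x3 array so that a third row and column exist
grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

print(grid[0][2])      # 3
print(grid[0, 2])      # 3
print(grid[:, 1:3])    # all rows, columns 2 and 3
# [[2 3]
#  [5 6]
#  [8 9]]
print(grid[:][1:3])    # grid[:] is the whole array, so this slices ROWS 2 and 3
# [[4 5 6]
#  [7 8 9]]
```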
```python
# Datacamp useful exercise
import numpy as np
# Create np_baseball (2 cols)
np_baseball = np.array(baseball)
# Print out the 50th row of np_baseball
print(np_baseball[49])
# Select the entire second column of np_baseball: np_weight_lb
np_weight_lb = np_baseball[:, 1]
# Print out height of 124th player
print(np_baseball[123, 0])
```
**Statistics in NumPy**
Calculating the #mean in NumPy
```python
import numpy as np
np.mean(np_2d[:, 0]) # calculating the mean of the first column
```
Calculating the #median in NumPy
```python
import numpy as np
np.median(np_2d[:, 0]) # calculating the median of the first column
```
Calculating the #correlation in NumPy using the #pearson-correlation-coefficient (Pearson product-moment correlation coefficient)
```python
import numpy as np
np.corrcoef(np_2d[:, 0], np_2d[:, 1]) # returns the correlation matrix of the first and second columns
```
Calculating the #standard-deviation in NumPy
```python
import numpy as np
np.std(np_2d[:, 0]) # calculating the standard deviation of the first column
```
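
The four statistics together on a small made-up array, as a sketch. Two details are easy to miss: `np.std` computes the population standard deviation unless `ddof=1` is passed, and `np.corrcoef` returns a 2x2 matrix in which the off-diagonal entry is the coefficient.

```python
import numpy as np

# Hypothetical data: the second column is exactly twice the first
data = np.array([[1, 2],
                 [2, 4],
                 [3, 6],
                 [4, 8]])

col0 = data[:, 0]
col1 = data[:, 1]

print(np.mean(col0))          # 2.5
print(np.median(col0))        # 2.5
print(np.std(col0))           # population standard deviation (ddof=0 by default)
print(np.std(col0, ddof=1))   # sample standard deviation

corr = np.corrcoef(col0, col1)   # 2x2 correlation matrix
print(corr[0, 1])                # correlation between col0 and col1, close to 1 here
```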
```python
# Nice statistics exercise from Datacamp
import numpy as np
# Convert positions and heights to numpy arrays: np_positions, np_heights
np_heights = np.array(heights)
np_positions = np.array(positions)
# Heights of the goalkeepers: gk_heights
gk_heights = np_heights[np_positions == 'GK']
# Heights of the other players: other_heights
other_heights = np_heights[np_positions != 'GK']
# Print out the median height of goalkeepers
print("Median height of goalkeepers: " + str(np.median(gk_heights)))
# Print out the median height of other players
print("Median height of other players: " + str(np.median(other_heights)))
```
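
The exercise relies on `positions` and `heights` being provided by Datacamp. To run it locally, a sketch with small made-up stand-ins for both lists could look like this:

```python
import numpy as np

# Hypothetical stand-ins for the exercise data
positions = ['GK', 'M', 'A', 'D', 'GK', 'D']
heights = [191, 184, 185, 180, 188, 182]

np_positions = np.array(positions)
np_heights = np.array(heights)

# The boolean array from the comparison selects the matching heights
gk_heights = np_heights[np_positions == 'GK']       # [191 188]
other_heights = np_heights[np_positions != 'GK']    # [184 185 180 182]

print("Median height of goalkeepers: " + str(np.median(gk_heights)))       # 189.5
print("Median height of other players: " + str(np.median(other_heights)))  # 183.0
```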