ML-Notes

From Depth Psychology Study Wiki
Revision as of 21:21, 26 December 2024 by SkyPanther

Creating Valid Data

When creating sample data, you usually need at least a 2D tensor (matrix), because most machine learning models expect a feature dimension. For example, a shape of (n, 1) means n samples with 1 feature each.

As an example:

For a set of houses, the shape is (n, 3): n samples with 3 features each.

  • Sample:
    1. A specific house.
  • Features:
    1. Size: 1500 square feet.
    2. Bedrooms: 3.
    3. Location Index: 2 (e.g., urban area).
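
As a rough illustration of the (n, 3) shape, the sketch below stacks a few houses into one tensor. Only the first row comes from the list above; the other rows and the variable name `houses` are made up for the example.

```python
import torch

# Each row is one sample (a house); each column is one feature:
# (size in square feet, bedrooms, location index).
# Rows 2-4 are hypothetical values, added only to show n > 1.
houses = torch.tensor([
    [1500.0, 3.0, 2.0],
    [2100.0, 4.0, 1.0],
    [900.0, 2.0, 3.0],
    [1750.0, 3.0, 2.0],
])

n_samples, n_features = houses.shape
print(n_samples, n_features)  # 4 3
```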

For a 1D range, the feature dimension is usually added with the unsqueeze method and dim=1, e.g.:

X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)

torch.arange creates a 1D tensor of 50 values (shape (50,)), which has samples but no feature dimension. Calling unsqueeze(dim=1) inserts a new dimension of size 1 at index 1, giving a 2D tensor of shape (50, 1).
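
A quick way to check this is to print the shape before and after the call. This is a minimal sketch; the name `flat` is introduced only for the example.

```python
import torch

flat = torch.arange(0, 1, 0.02)  # 1D tensor: samples only, no feature dimension
X = flat.unsqueeze(dim=1)        # 2D tensor: one feature per sample

print(flat.shape)  # torch.Size([50])
print(X.shape)     # torch.Size([50, 1])
```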

Setting the Algorithm

This particular problem uses linear regression, where weight and bias are the parameters the model has to learn. It is written like this in code:

y = weight * X + bias
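
To build a synthetic dataset from this formula, one option is to pick known values for the parameters and compute y from X. The values 0.7 and 0.3 below are assumptions chosen for illustration; they are not given in these notes.

```python
import torch

# Assumed "ground truth" parameters, chosen only for this example.
weight = 0.7
bias = 0.3

X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)  # shape (50, 1)
y = weight * X + bias                          # elementwise, same shape as X

print(X.shape, y.shape)
```

Because the true parameters are known, you can later compare them with what the model learns.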

Creating Training/Testing Split

A common convention is to split the data 80/20: 80% for training and 20% for testing.

Something like:

train_split = int(0.8 * len(X)) # index at 80% of the dataset length; it must be an int to be used in a slice
X_train, y_train = X[:train_split], y[:train_split] # [:train_split] takes everything from the start up to (not including) the 80% mark
X_test, y_test = X[train_split:], y[train_split:] # [train_split:] takes everything from the 80% mark to the end, i.e. the remaining 20%
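
As a sanity check, the sizes of the two splits can be printed. The sketch below assumes the 50-sample X from earlier and a y built with illustrative parameters (0.7 and 0.3 are assumptions). Note that this split is sequential and not shuffled.

```python
import torch

X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = 0.7 * X + 0.3  # assumed weight and bias, for illustration only

train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

print(len(X_train), len(X_test))  # 40 10
```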

Creating/Inheriting Model class

To create a model, you need to import nn from torch; the class you need in particular is nn.Module.

Usually something like:

import torch
from torch import nn

You then define a custom class that subclasses nn.Module:

class LinearRegressionModel(nn.Module): # nn.Module is the base class for all neural network modules in PyTorch; the custom class inherits from it
    def __init__(self): # constructor: initializes the class's attributes
        super().__init__() # calls nn.Module's constructor so the parent class is set up properly
        self.weights = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float)) # learnable model parameter, randomly initialized
        self.bias = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float)) # learnable model parameter, randomly initialized

    def forward(self, x: torch.Tensor) -> torch.Tensor: # REQUIRED: every nn.Module subclass must override the forward method
        return self.weights * x + self.bias # the linear regression formula

Inside the class you need to initialize the weights and biases, usually to random values or zero, and define the forward method. The forward method is required.
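
Once the class exists, it can be instantiated and inspected. The sketch below repeats the class so it runs on its own; the seed value is arbitrary and only there to make the random initialization repeatable.

```python
import torch
from torch import nn


class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(1, dtype=torch.float))
        self.bias = nn.Parameter(torch.randn(1, dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weights * x + self.bias


torch.manual_seed(42)  # arbitrary seed, for reproducibility only
model = LinearRegressionModel()

# state_dict() lists the learnable parameters by attribute name.
print(model.state_dict())

# Calling the model runs forward(); the output keeps the input's shape.
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
print(model(X).shape)
```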

After the model is created, you need to initialize the loss function and the optimizer, and tell the optimizer which parameters it should optimize.
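
A minimal sketch of that setup follows. These notes do not say which loss or optimizer to use, so nn.MSELoss, torch.optim.SGD, and the learning rate of 0.01 are assumptions picked as common defaults for linear regression.

```python
import torch
from torch import nn


class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(1, dtype=torch.float))
        self.bias = nn.Parameter(torch.randn(1, dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weights * x + self.bias


model = LinearRegressionModel()

# Assumed choices: mean squared error loss and plain SGD.
loss_fn = nn.MSELoss()
# model.parameters() hands the optimizer the weights and bias to update.
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.01)
```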

Then, in the training loop, each epoch does the following: put the model in training mode, run a forward pass, calculate the loss, zero the accumulated gradients, run backpropagation on the loss, and call the optimizer's step function to update the parameters.
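
The steps above can be sketched as follows. The data parameters (0.7, 0.3), the loss (nn.MSELoss), the optimizer (SGD with lr=0.01), the epoch count, and the seed are all assumptions made so the example runs on its own.

```python
import torch
from torch import nn


class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(1, dtype=torch.float))
        self.bias = nn.Parameter(torch.randn(1, dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weights * x + self.bias


torch.manual_seed(42)  # arbitrary seed, for reproducibility only

# Synthetic data; weight=0.7 and bias=0.3 are assumed values.
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = 0.7 * X + 0.3
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]

model = LinearRegressionModel()
loss_fn = nn.MSELoss()                                    # assumed loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # assumed optimizer

losses = []
for epoch in range(200):             # epoch count is an assumption
    model.train()                    # 1. training mode
    y_pred = model(X_train)          # 2. forward pass
    loss = loss_fn(y_pred, y_train)  # 3. calculate the loss
    optimizer.zero_grad()            # 4. zero the accumulated gradients
    loss.backward()                  # 5. backpropagation
    optimizer.step()                 # 6. update the parameters
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Gradients accumulate across backward() calls by default, which is why zero_grad() is called in every iteration.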

Once training is done, you can test the model: switch it to evaluation mode with model.eval(), run a forward pass on the test data, calculate the loss, and inspect the results on previously unseen data.
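
The evaluation pattern might look like the sketch below. To stay short it evaluates an untrained model, so the loss will be large; in practice this would run after the training loop. The data parameters (0.7, 0.3) and nn.MSELoss are assumptions, and wrapping the pass in torch.inference_mode() is a common way to skip gradient tracking rather than something these notes prescribe.

```python
import torch
from torch import nn


class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(1, dtype=torch.float))
        self.bias = nn.Parameter(torch.randn(1, dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weights * x + self.bias


# Synthetic data; weight=0.7 and bias=0.3 are assumed values.
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = 0.7 * X + 0.3
train_split = int(0.8 * len(X))
X_test, y_test = X[train_split:], y[train_split:]

model = LinearRegressionModel()  # untrained here, to keep the sketch short
loss_fn = nn.MSELoss()           # assumed loss

model.eval()                     # evaluation mode
with torch.inference_mode():     # no gradient tracking during testing
    test_pred = model(X_test)    # forward pass on unseen data
    test_loss = loss_fn(test_pred, y_test)

print(f"test loss: {test_loss.item():.4f}")
```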