import React from 'react';
import './css/GettingStarted.css'; // Import your CSS file for styling
import { Link } from 'react-router-dom';
import image1 from "../images/9mlops.JPG";
import image2 from "../images/10deep.png";
import image3 from "../images/11forecasting.png";
import image4 from '../images/12scatter.png';

function ModelBuildingStep() {
  // Smoothly scroll to the section whose element id matches sectionId.
  const scrollToSection = (sectionId) => {
    const element = document.getElementById(sectionId);
    if (element) {
      element.scrollIntoView({ behavior: 'smooth' });
    }
  };
  return (
    <div className="project-details">
      <h2>Model Building and Algorithm Selection</h2>
      <p>
        In the "Build MB" step, you can choose from four powerful options to build and refine your machine learning models and equations. Each option is designed to cater to specific modeling needs. Below are instructions for each of the available options:
      </p>
      <ul style={{textDecoration:'none'}}>
        <li>
          <a href="#mlops" onClick={(e) => { e.preventDefault(); scrollToSection('mlops'); }}>
            MLOps for Machine Learning Regression and Classification Algorithms
          </a>
        </li>
        <li>
          <a href="#dl" onClick={(e) => { e.preventDefault(); scrollToSection('dl'); }}>
            Deep Learning
          </a>
        </li>
        <li>
          <a href="#forecasting" onClick={(e) => { e.preventDefault(); scrollToSection('forecasting'); }}>
            Forecasting for Time Series Data Algorithms
          </a>
        </li>
        <li>
          <a href="#scatter" onClick={(e) => { e.preventDefault(); scrollToSection('scatter'); }}>
            Scatter Plot Method
          </a>
        </li>
      </ul>

      <h3 id="mlops">MLOps for Machine Learning Regression and Classification Algorithms</h3>
      <p>
        <strong>Building Machine Learning Models</strong>
      </p>
      <p>
        <strong>To begin building machine learning models, select the "MLOps" option.</strong>
      </p>
      <p>
        <strong>Select Independent and Dependent Variables:</strong> The interface lists the independent and dependent variables loaded from your data. Independent variables are the input features that influence the outcome; dependent variables are the targets you want the model to predict.
      </p>

      <p>
        <strong>Choose Regression and Classification Algorithms:</strong> For each selected dependent variable, you can choose from a dropdown list of regression and classification algorithms. These algorithms are tailored to handle specific types of data and predictive tasks. Select the algorithms that align with your modeling objectives.
      </p>
      <p>
        <strong>Specify Train Data Percentage:</strong> In the "Train Data %" input box, specify the percentage of your dataset that will be used for training the machine learning models. The remaining data is held out for testing or validation. For example, a value of 80 trains on 80% of the data and reserves the other 20%.
      </p>
      <p>
        <strong>Hyperparameter Tuning (Optional):</strong> If you wish to fine-tune the hyperparameters of your selected algorithms, explore the hyperparameter tuning options. Adjusting hyperparameters can enhance the performance of your models. Experiment with different hyperparameter settings to optimize your results.
      </p>
      <p>
        <strong>Train Your Models:</strong> After configuring the variables, algorithms, and hyperparameters, initiate the training process. The software will use the specified percentage of training data to build and optimize your machine learning models.
      </p>

      <img src={image1} className='imagestyling' alt="MLOps model building screen"/>

      <h3>Algorithms</h3>

      <h4>MBCV - Model Builder for LassoCV (Lasso Cross-Validation) Algorithm</h4>
      <p>
        Lasso (Least Absolute Shrinkage and Selection Operator) regression is a linear regression technique that incorporates L1 regularization. The regularization term penalizes the absolute size of the regression coefficients, which shrinks the coefficients of less important features to exactly zero and keeps only the most relevant ones. Lasso therefore helps prevent overfitting and performs feature selection. LassoCV additionally chooses the regularization strength (alpha) automatically through cross-validation.
      </p>
      <p>
        <strong>For detailed information on how to work with LassoCV Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LassoCV.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBLR - Model Builder for Linear Regression Algorithm</h4>
      <p>
        Linear Regression is a fundamental supervised machine learning algorithm used for modeling the relationship between a dependent variable (target) and one or more independent variables (features) by fitting a linear equation to the observed data.
      </p>
      <p>
        <strong>For detailed information on how to work with Linear Regression Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBPR - Model Builder for Polynomial Regression Algorithm</h4>
      <p>
        Polynomial Regression is a variation of linear regression that allows you to model the relationship between a dependent variable and one or more independent variables as an nth-degree polynomial rather than a straight line. In other words, it fits a polynomial equation to the data instead of a linear equation. This makes it more flexible in capturing nonlinear relationships between variables.
      </p>
      <p>
        <strong>For detailed information on how to work with Polynomial Regression Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBRF - Model Builder for Random Forest Algorithm</h4>
      <p>
        Random Forest is an ensemble learning algorithm widely used for both classification and regression tasks. It trains many decision trees on random subsets of the data and features, then combines their predictions (averaging for regression, majority vote for classification). Combining many trees gives robust, accurate predictions and reduces the overfitting that a single deep tree is prone to.
      </p>
      <p>
        <strong>For detailed information on how to work with Random Forest Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBKN - Model Builder for k-Nearest Neighbors (k-NN) Algorithm</h4>
      <p>
        The k-Nearest Neighbors (k-NN) algorithm is a simple, versatile machine learning algorithm used for classification and regression tasks. It is a non-parametric, instance-based learning method: it makes no assumptions about the underlying data distribution and predicts from the k stored training instances closest to the query point.
      </p>

      <p>
        <strong>For detailed information on how to work with k-Nearest Neighbors (k-NN) Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>






      <h4>MBDT- Model Builder for Decision Tree Regression Algorithm</h4>
      <p>
        Decision Tree Regressor is a regression algorithm that models data using a tree-like structure of decision nodes and leaf nodes. It recursively splits the data into subsets based on the features' values, with each split chosen to minimize the variance of the target variable within each subset. The predicted value for each leaf node is typically the mean of the target values of the data points in that node. Decision trees can capture nonlinear relationships in the data but can be prone to overfitting when they become too deep.
      </p>
      <p>
        <strong>For detailed information on how to work with Decision Tree Regression Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBML- Model Builder for MultiTask LassoCV Algorithm</h4>
      <p>
        Multi-Task LassoCV is a variant of Lasso regression designed for multi-task learning, where multiple related tasks (e.g., several dependent variables) are learned simultaneously. It uses a mixed L1/L2-norm penalty that drives a feature's coefficients to zero across all tasks at once, so the same subset of features is selected for every task. The algorithm automatically selects the optimal regularization strength (alpha) through cross-validation.
      </p>
      <p>
        <strong>For detailed information on how to work with Multi Task Lasso CV Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.MultiTaskLassoCV.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBX- Model Builder for XGBoost (Extreme Gradient Boosting)</h4>
      <p>
        XGBoost is a popular ensemble learning algorithm known for its efficiency and effectiveness in machine learning tasks, including classification and regression. It belongs to the boosting family of algorithms, which combine the predictions of multiple weak learners (typically decision trees) into a strong ensemble model. XGBoost minimizes a loss function (e.g., squared error for regression or log-loss for classification) by iteratively adding decision trees to the ensemble, with each new tree fitted to correct the errors made by the previous ones.
      </p>
      <p>
        <strong>For detailed information on how to work with XGBoost Hyperparameters, <a href="https://xgboost.readthedocs.io/en/stable/parameter.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBSVM- Model Builder for Support Vector Regression Algorithm</h4>
      <p>
        Support Vector Regression (SVR) is a variation of Support Vector Machines (SVM) used for regression tasks. SVR fits a function that keeps as many training points as possible within a tolerance band (epsilon) around its predictions: errors smaller than epsilon are ignored, and larger errors are penalized. It uses a kernel function to implicitly map the features into a higher-dimensional space, allowing SVR to capture nonlinear relationships. A regularization parameter (C) balances the flatness of the regression function against the penalty on errors, which helps prevent overfitting.
      </p>
      <p>
        <strong>For detailed information on how to work with Support Vector Regression Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBMLP- Model Builder for Multi Layer Perceptron Algorithm</h4>
      <p>
        MLPRegressor stands for Multi-layer Perceptron Regressor, and it is a type of artificial neural network model used for regression tasks. It is a part of the scikit-learn library in Python. MLPRegressor is based on feedforward neural networks and is capable of learning complex relationships between input features and target values.
      </p>
      <p>
        <strong>For detailed information on how to work with MLPRegressor Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>MBKNN- Model Builder for Logistic Regression Algorithm</h4>
      <p>
        Logistic Regression is a widely used statistical and machine learning algorithm for binary classification tasks, where the goal is to predict one of two possible outcomes (usually labeled as 0 and 1). Despite its name, logistic regression is a classification algorithm, not a regression algorithm.
      </p>
      <p>
        <strong>For detailed information on how to work with Logistic Regression Hyperparameters, <a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h3 id="dl">Deep Learning</h3>
      <p>
        <strong>Choose the "Deep Learning" option to work with deep neural networks.</strong>
      </p>
      <p>
        <strong>Training and Optimization:</strong> Train your deep learning models and fine-tune them for optimal performance. This option empowers you to work on complex and data-intensive problems.
      </p>
      <p>
        <strong>Train Data Percentage (train data %):</strong> The "train data %" parameter represents the portion of the available dataset that is used for training the neural network.
      </p>
      <p>
        <strong>Number of Layers (no of layers):</strong> The "number of layers" parameter specifies the architecture of the neural network in terms of how many layers it contains.
      </p>
      <p>
        <strong>Number of Epochs (no of epochs):</strong> An "epoch" refers to one complete pass through the entire training dataset during the training process.
      </p>
      <p>
        <strong>Batch Size (batch size):</strong> During training, data is processed in batches rather than processing the entire dataset at once. The "batch size" parameter determines the number of data samples processed in each batch.
      </p>
      <img src={image2} className='imagestyling' alt="Deep Learning configuration screen"/>

      <h3 id="forecasting">Forecasting for Time Series Data Algorithms</h3>
      <p>
        Select the "Forecasting for Time Series Data Algorithms" option to build models for time series data.
      </p>

      <h3>Time Series Analysis</h3>
      <p>
        Analyze your time series data, perform trend analysis, and choose the most suitable forecasting algorithms for accurate predictions.
      </p>

      <h3>Model Training and Evaluation</h3>
      <p>
        Train your forecasting models and evaluate their performance using historical data to make informed future predictions.
      </p>

      <img src={image3} className='imagestyling' alt="Forecasting configuration screen"/>

      
      <h3>Time Series Algorithms</h3>
      
      <h4>AR - Model Builder for Autoregression (AutoReg) Algorithm</h4>
      <p>
        AutoReg is an autoregressive model: it predicts the next value of a time series as a linear combination of its own past values (lags). It is commonly used for modeling and forecasting time series data in domains such as finance, economics, and environmental science, and is especially useful when you suspect that past values of a series have a significant influence on future values.
      </p>
      <p>
        <strong>For detailed information on how to work with Auto Regression Hyperparameters, <a href="https://www.statsmodels.org/dev/generated/statsmodels.tsa.ar_model.AutoReg.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>ARIMA- Model Builder for Autoregressive Integrated Moving Average Algorithm</h4>
      <p>
        ARIMA, which stands for Autoregressive Integrated Moving Average, is a popular and powerful time series forecasting model used to analyze and predict time-dependent data. It combines three key components: autoregression (AR), differencing (I for integrated), and moving average (MA) to capture various patterns and trends in time series data.
      </p>
      <p>
        <strong>For detailed information on how to work with ARIMA Hyperparameters, <a href="https://www.statsmodels.org/stable/generated/statsmodels.tsa.arima.model.ARIMA.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>SARIMAX - Model Builder for Seasonal Autoregressive Integrated Moving Average with Exogenous Regressors Algorithm</h4>
      <p>
        SARIMAX is an extension of the ARIMA model designed to handle seasonal time series data. It combines autoregressive (AR), differencing (I), and moving average (MA) components with seasonal counterparts of each, and it can optionally include exogenous regressors (the "X" in SARIMAX).
      </p>
      <p>
        <strong>For detailed information on how to work with Seasonal Autoregressive Integrated Moving Average Hyperparameters, <a href="https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>
      <h4>ES- Model Builder for Exponential Smoothing (ES) Algorithm</h4>
      <p>
        Exponential Smoothing is a family of forecasting methods that predict from a weighted average of past observations, with weights that decrease exponentially as observations get older. It's particularly useful for time series data with trends and seasonality.
      </p>
      <p>
        <strong>For detailed information on how to work with Exponential Smoothing (ES) Hyperparameters, <a href="https://www.statsmodels.org/dev/generated/statsmodels.tsa.holtwinters.ExponentialSmoothing.html" target="_blank" rel="noopener noreferrer">click here</a>.</strong>
      </p>

      <h4>PROPHET- Model Builder for Prophet Algorithm</h4>
      <p>
        Prophet is a forecasting algorithm developed by Facebook for time series data with daily observations that may exhibit seasonality and holiday effects. It's designed to be user-friendly and capable of handling missing data and outliers.
      </p>

      <h3 id="scatter">Scatter Plot Method</h3>
      <p>
        The "Scatter Plot Method" is a powerful tool for visual data analysis that allows you to gain insights into your dataset and identify underlying patterns. This method is particularly useful when you want to understand the relationships between variables and make data-driven decisions.
      </p>

      <h3>Choose Equation Type:</h3>
      <p>
        One of the key features of the Scatter Plot Method is the ability to fit equations to your data. You can choose from various equation types, including:
      </p>
      <ul>
        <li><strong>Linear Equation:</strong> Use a linear equation when you suspect a linear relationship between variables. This equation type represents a straight line on the scatter plot.</li>
        <li><strong>Exponential Equation:</strong> Opt for an exponential equation when your data exhibits exponential growth or decay patterns. It is commonly used to model phenomena where the rate of change is proportional to the current value.</li>
        <li><strong>Logarithmic Equation:</strong> A logarithmic equation is suitable for data that shows diminishing returns or growth that slows over time. It can help you uncover intricate relationships that may not be apparent in a linear view.</li>
        <li><strong>Hyperbolic Equation:</strong> When you encounter data with asymptotic behavior, a hyperbolic equation can provide insights. It describes situations where the dependent variable approaches a constant limit as the independent variable gets larger.</li>
        <li><strong>Quadratic Equation:</strong> Choose a quadratic equation for data that displays a curved or parabolic relationship. It can capture the curvature and nonlinearity in your dataset.</li>
      </ul>

      <h3>Interactive Plotting:</h3>
      <p>
        With the Scatter Plot Method, you can create scatter plots of your data and interactively adjust parameters to find the best-fit equation that describes your data accurately. This dynamic approach allows you to fine-tune your understanding of the relationships within your dataset and make informed decisions based on the insights gained.
      </p>

      <p>
        Whether you are exploring correlations, identifying outliers, or making predictions, the Scatter Plot Method provides a valuable visual tool for data analysis and modeling.
      </p>

      <img src={image4} className='imagestyling' alt="Scatter Plot Method screen"/>

    </div>
  );
}

export default ModelBuildingStep;
