<?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>markokello</title>
      <link>https://markokello.com/blog</link>
      <description></description>
      <language>en-us</language>
      <lastBuildDate>Wed, 28 Aug 2019 00:00:00 GMT</lastBuildDate>
      <atom:link href="https://markokello.com/tags/deep-learning/feed.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid>https://markokello.com/blog/Linear-algebra</guid>
    <title>Linear Algebra Concepts for Data Science and Machine Learning</title>
    <link>https://markokello.com/blog/Linear-algebra</link>
    <description>This post explains the mathematics and theory behind key classical machine learning algorithms: Linear Regression, Logistic Regression, k-NN, Naive Bayes, and Decision Trees.</description>
    <pubDate>Wed, 28 Aug 2019 00:00:00 GMT</pubDate>
    <category>Maths</category><category>Statistics</category><category>Deep Learning</category>
  </item>

  <item>
    <guid>https://markokello.com/blog/derivatives</guid>
    <title>Derivatives, Partial Derivatives, Vector and Matrix Calculus</title>
    <link>https://markokello.com/blog/derivatives</link>
    <description>An introduction to derivatives, partial derivatives, and vector and matrix calculus.</description>
    <pubDate>Wed, 30 Oct 2019 00:00:00 GMT</pubDate>
    <category>ML</category><category>Maths</category><category>Deep Learning</category>
  </item>

  <item>
    <guid>https://markokello.com/blog/gradient-descent</guid>
    <title>Gradient Descent Variants</title>
    <link>https://markokello.com/blog/gradient-descent</link>
    <description>Gradient descent is an iterative optimization algorithm for training machine learning models, with the primary purpose of finding the optimal parameters (weights and biases) for the model. The gradient of the loss function, a vector of partial derivatives, points in the direction of steepest increase, so the parameters are repeatedly updated by taking a step in the opposite direction. The size of the step is controlled by the learning rate. Through this process, the algorithm gradually drives the loss lower until it converges toward a local minimum.</description>
    <pubDate>Tue, 19 Nov 2019 00:00:00 GMT</pubDate>
    <category>ML</category><category>Deep Learning</category>
  </item>

  <item>
    <guid>https://markokello.com/blog/information-theory</guid>
    <title>Entropy, Cross-entropy, KL divergence and Beyond</title>
    <link>https://markokello.com/blog/information-theory</link>
    <description>Entropy measures the level of uncertainty or randomness in a dataset. Information gain, in turn, measures how much a decision tree split reduces this entropy, helping to identify which features create the most meaningful divisions in the data and lead to better classification decisions.</description>
    <pubDate>Mon, 23 Sep 2019 00:00:00 GMT</pubDate>
    <category>ML</category><category>Deep Learning</category>
  </item>

  <item>
    <guid>https://markokello.com/blog/probability-distributions</guid>
    <title>A Sample of Probability Distributions and Their Properties</title>
    <link>https://markokello.com/blog/probability-distributions</link>
    <description>An overview of common probability distributions and their key properties.</description>
    <pubDate>Fri, 15 May 2020 00:00:00 GMT</pubDate>
    <category>ML</category><category>Statistics</category><category>Maths</category><category>Deep Learning</category>
  </item>

    </channel>
  </rss>
