# Principal Component Analysis

The “curse of dimensionality” refers to various challenges that arise as the number of dimensions in a dataset increases, such as exponential growth in data space, sparsity of data, loss of meaningful distance metrics, increased computational complexity, and higher risk of overfitting. Principal Component Analysis (PCA) is a powerful tool to combat these challenges by reducing the dimensionality of data while retaining most of the original variance. This note explores PCA in depth, including its mathematical foundation, applications, and practical considerations.
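As a rough sketch of the idea, PCA can be implemented directly with NumPy by diagonalizing the covariance matrix of centered data; the variable names and the random demo data below are illustrative, not from the note:

```python
import numpy as np

# Demo data: 100 samples with 5 features (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# 1. Center the data so each feature has zero mean.
X_centered = X - X.mean(axis=0)

# 2. Sample covariance matrix of the features (5 x 5).
cov = np.cov(X_centered, rowvar=False)

# 3. Eigendecomposition; eigh is appropriate because cov is symmetric.
eigvals, eigvecs = np.linalg.eigh(cov)

# 4. Sort components by descending eigenvalue (explained variance).
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 5. Project onto the top-2 principal components.
n_components = 2
X_reduced = X_centered @ eigvecs[:, :n_components]
print(X_reduced.shape)  # (100, 2)
```

In practice one would typically reach for `sklearn.decomposition.PCA`, which wraps the same centering-and-decomposition steps (using an SVD) behind a `fit_transform` call.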

# Covariance
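PCA rests on the covariance matrix of the data, which it diagonalizes to find the directions of maximal variance. For $n$ samples $x_i \in \mathbb{R}^d$ with sample mean $\bar{x}$, the sample covariance matrix can be written as (the notation here is assumed, not taken from the note):

$$
\Sigma = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^\top
$$

Each entry $\Sigma_{jk}$ measures how features $j$ and $k$ vary together; the principal components are the eigenvectors of $\Sigma$, ordered by their eigenvalues.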

# Natural Language Processing: A Primer

I’ve been fascinated by the possibility of extracting knowledge from large bodies of text using computational methods since… well, since I started reading scientific literature. Natural Language Processing (NLP) is a branch of machine learning that focuses on the interaction between computers and humans through natural language. The ultimate objective of NLP is to enable computers to understand, interpret, and generate human language in a way that is both meaningful and useful. This note gives an overview of basic NLP concepts and methods, with practical examples in Python.