OpenSource Name: ShuaiW/ml-interview
OpenSource URL: https://github.com/ShuaiW/ml-interview
OpenSource Language: Jupyter Notebook 100.0%
OpenSource Introduction:

Data Science Question Answer

This repo is deprecated. Please go to the new repo.

This repository covers how to prepare for machine learning interviews, mainly in the format of questions & answers. Aside from machine learning knowledge, other crucial aspects include:

Go directly to machine learning

Explain your resume

Your resume should specify interesting ML projects you were involved in, and quantitatively show your contribution. Consider the following comparison:
vs.
We can all tell which one is going to catch the interviewer's eye and better showcase your ability. In the interview, be sure to explain well what you've done. Spend some time going over your resume before the interview.

SQL

Although you don't have to be a SQL expert for most machine learning positions, interviewers might ask you some SQL-related questions, so it helps to refresh your memory beforehand. Some good SQL resources are:

Machine learning

First, it's always a good idea to review Chapter 5 of the deep learning book, which covers machine learning basics.
Linear regression
Logistic regression
KNN

Given a data point, we compute its K nearest data points (neighbors) using a certain distance metric (e.g., the Euclidean metric). For classification, we take the majority label of the neighbors; for regression, we take the mean of the label values. Note that for KNN we technically don't need to train a model; we simply compute at inference time. This can be computationally expensive, since each test example needs to be compared with every training example to see how close they are. There are approximation methods that achieve faster inference time by partitioning the training data into regions. Note that when K equals 1 or another small number, the model is prone to overfitting (high variance), while when K equals the number of data points or another large number, the model is prone to underfitting (high bias).
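As a concrete illustration, here is a minimal sketch using scikit-learn (not part of the original repo); the dataset and the choice of K below are arbitrary assumptions.

```python
# Minimal KNN sketch with scikit-learn; dataset and K are arbitrary choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Training" only stores the data; distance computations happen at inference time.
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_train, y_train)
print(knn.score(X_test, y_test))

# Small K -> high variance (overfitting); large K -> high bias (underfitting).
```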
SVM

Decision tree
Bagging

To address overfitting, we can use an ensemble method called bagging (bootstrap aggregating), which reduces the variance of the meta learning algorithm. Bagging can be applied to decision trees or other algorithms. Here is a great illustration of a single estimator vs. bagging.
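A minimal sketch of bagging with scikit-learn, assuming decision trees as the base estimator (the synthetic dataset below is a stand-in):

```python
# Bagging: train many trees on bootstrap samples and average/vote their predictions.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Each of the 100 trees sees a bootstrap sample of the training data;
# voting/averaging across them reduces the variance of a single deep tree.
bagged = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=100,
    bootstrap=True,
    random_state=0,
)
bagged.fit(X, y)
```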
Random forest

Random forest improves bagging further by adding some randomness. In random forest, only a subset of features is selected at random to construct each tree (while instances are often not subsampled). The benefit is that random forest decorrelates the trees. For example, suppose we have a dataset with one very predictive feature and a couple of moderately predictive features. In bagged trees, most of the trees will use the very predictive feature in the top split, making most of the trees look similar and highly correlated. Averaging many highly correlated results won't lead to a large reduction in variance compared with averaging uncorrelated results. In random forest, for each split we only consider a subset of the features, and therefore reduce the variance even further by introducing more uncorrelated trees. I wrote a notebook to illustrate this point. In practice, tuning a random forest entails using a large number of trees (the more the better, but always consider the computation constraint).
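A minimal sketch with scikit-learn (an assumed illustration with a toy dataset); `max_features` controls the random subset of features considered at each split:

```python
# Random forest: bagging + random feature subsets at each split to decorrelate trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5, random_state=0)

# max_features="sqrt" means each split only considers sqrt(n_features) candidate
# features; n_estimators is usually "as many trees as you can afford".
rf = RandomForestClassifier(n_estimators=500, max_features="sqrt", random_state=0)
rf.fit(X, y)
```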
Feature importance

In a decision tree, important features are likely to appear closer to the root of the tree. We can get a feature's importance for a random forest by computing the average depth at which it appears across all trees in the forest.

Boosting

How it works

Boosting builds on weak learners in an iterative fashion. In each iteration, a new learner is added, while all existing learners are kept unchanged. All learners are weighted based on their performance (e.g., accuracy), and after a weak learner is added, the data are re-weighted: examples that are misclassified gain more weight, while examples that are correctly classified lose weight. Thus, future weak learners focus more on the examples that previous weak learners misclassified.
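The re-weighting idea can be sketched with scikit-learn's AdaBoost, one concrete boosting algorithm (the text above describes boosting generically, so this is only an assumed illustration on a synthetic dataset):

```python
# AdaBoost: each new shallow tree focuses on examples the previous ones got wrong.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Weak learners: depth-1 trees ("stumps"). Misclassified examples get larger
# weights after each round, so later stumps concentrate on the hard cases.
boosted = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=200,
    learning_rate=0.5,
    random_state=0,
)
boosted.fit(X, y)
```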
Difference from random forest (RF)

XGBoost (Extreme Gradient Boosting)
Stacking
MLP

A feedforward neural network with multiple layers. Each layer can have multiple neurons, and each neuron in the next layer is a linear/nonlinear combination of all the neurons in the previous layer. In order to train the network, we backpropagate the errors layer by layer. In theory an MLP can approximate any function.

CNN

The Conv layer is the building block of a convolutional network. The Conv layer consists of a set of learnable filters (such as 5 * 5 * 3, width * height * depth). During the forward pass, we slide (or more precisely, convolve) the filter across the input and compute the dot product. Learning again happens when the network backpropagates the error layer by layer. Initial layers capture low-level features such as angles and edges, while later layers learn combinations of the low-level features from the previous layers and can therefore represent higher-level features, such as shapes and object parts.

RNN and LSTM

RNN is another paradigm of neural network where we have different layers of cells, and each cell takes as input not only the cell from the previous layer, but also the previous cell within the same layer. This gives RNN the power to model sequences. This seems great, but in practice RNN barely works due to exploding/vanishing gradients, which are caused by a series of multiplications of the same matrix. To solve this, we can use a variation of RNN called long short-term memory (LSTM), which is capable of learning long-term dependencies. The math behind LSTM can be pretty complicated, but intuitively LSTM introduces:

- input gate
- output gate
- forget gate
- memory cell (internal state)

LSTM resembles human memory: it forgets old stuff (old internal state * forget gate) and learns from new input (input node * input gate).
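A minimal sketch of an LSTM-based sequence model in Keras (an assumed setup, not from the original repo): a binary classifier over integer-encoded token sequences, with arbitrary vocabulary size and dimensions.

```python
# Minimal LSTM sequence classifier; vocabulary size and dimensions are arbitrary.
import numpy as np
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Embedding(input_dim=10000, output_dim=64),  # token ids -> dense vectors
    layers.LSTM(64),                                    # gated recurrence over the sequence
    layers.Dense(1, activation="sigmoid"),              # binary prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy data: 32 sequences of length 20 with token ids in [0, 10000).
X = np.random.randint(0, 10000, size=(32, 20))
y = np.random.randint(0, 2, size=(32,))
model.fit(X, y, epochs=1, verbose=0)
```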
word2vec

Generative vs discriminative
Parametric vs Nonparametric