Book summary: "Warren Buffett and the Interpretation of Financial Statements"

Warren realized that if a company’s competitive advantage could be maintained for a long period of time—if it was “durable”—then the underlying value of the business would continue to increase year after year. Warren likes to think of these companies as owning a piece of the consumer’s mind, and when a company owns a piece of the consumer’s mind, it never has to change its products, which, as you will find out, is a good thing.

Kubernetes rolling updates

Note: this tutorial uses Kubernetes v1.17.2 and Docker 19.03.5 on macOS 10.15.2. Kubernetes core concepts: in a Kubernetes cluster, a node is a virtual machine or a physical computer that serves as a worker machine. The master is responsible for managing the cluster: it coordinates all activities, such as scheduling applications, maintaining applications' desired state, scaling applications, and rolling out new updates.
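
The rolling-update behavior is configured on a Deployment. A minimal sketch, where the Deployment name "web", its labels, and the image "myapp:1.0" are all placeholders I made up for illustration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod during the update
      maxUnavailable: 1    # at most one pod down during the update
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: myapp:1.0   # hypothetical image
```

Changing the pod template (for example with kubectl set image deployment/web web=myapp:2.0) triggers a rolling update, and kubectl rollout status deployment/web monitors its progress.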

Target encoding

The common approaches to encoding categorical data are one-hot encoding and label encoding. Recently I encountered another method called "target encoding," which is more efficient for high-cardinality features. The technique was introduced by Daniele Micci-Barreca ("A Preprocessing Scheme for High-Cardinality Categorical Attributes in Classification and Prediction Problems"). The idea is to use the relationship between the categorical feature x and the target y to obtain a more meaningful numerical representation of x. If x has N unique values x_i, this relationship is defined as a function of the count of x_i (how many times we observe x equal to x_i) and the mean of y corresponding to each x_i: each x_i is mapped to a blend of its local target mean and the global target mean, weighted by its count.
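
A minimal sketch of the idea in pandas. Note the weighting function n / (n + m) is a common smoothing variant, not the exact sigmoid weighting from Micci-Barreca's paper; the function name and the smoothing parameter are my own choices:

```python
import pandas as pd

def target_encode(x, y, smoothing=10.0):
    """Smoothed target encoding: map each category to a blend of the
    category's target mean and the global prior, weighted by count."""
    df = pd.DataFrame({"x": x, "y": y})
    prior = df["y"].mean()
    stats = df.groupby("x")["y"].agg(["count", "mean"])
    # lam(n) = n / (n + smoothing): the more often a category occurs,
    # the more weight its local mean gets over the global prior
    lam = stats["count"] / (stats["count"] + smoothing)
    encoding = lam * stats["mean"] + (1 - lam) * prior
    return df["x"].map(encoding)
```

Rare categories thus shrink toward the global mean, which limits overfitting; in practice the encoding should be computed on training folds only to avoid target leakage.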

spaCy pretrain command

spaCy 2.1 introduced an interesting command, spacy pretrain. It loads pre-trained vectors (https://spacy.io/models/) and trains a CNN model to predict each word's pre-trained vector instead of the word itself, a technique its authors call Language Modelling with Approximate Outputs (LMAO). According to the creator of spaCy, this approach is especially useful when you have limited training data for text classification and parsing tasks. He used the pretraining output to train a text classifier on 1,000 samples and reported an F1-score of 87% on a test set of 5,000 samples.
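
A minimal sketch of the workflow on the command line, assuming spaCy 2.1 is installed; the corpus file texts.jsonl and the output directory are hypothetical names:

```shell
# Download a package of pre-trained vectors (one of the models
# listed at https://spacy.io/models/)
python -m spacy download en_vectors_web_lg

# Pretrain: a CNN learns to predict each word's pre-trained vector
# from its context; weights are saved to ./pretrain_out
python -m spacy pretrain texts.jsonl en_vectors_web_lg ./pretrain_out
```

The saved weights can then be loaded when training a downstream pipeline component, for example via the --init-tok2vec option of spacy train in v2.1.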