European IT professional; experienced project manager and programmer with skills covering data science and DataOps with Python and R. I am interested in clean code and in software quality in general.
eric | Sept. 24, 2024, 10:51 a.m.
In R programming, both `recode()` and `case_match()` are used to replace or reassign values in a vector. However, the function `recode()` is considered superseded in favor of the more general `case_ma...
eric | Sept. 13, 2024, 5:51 p.m.
Data cleaning is a crucial first step in any data analysis or machine learning pipeline. Without clean and properly structured data, the results of any analysis can be misleading or invalid. In this g...
eric | May 16, 2023, 9:09 a.m.
A basic Apache Kafka test setup with two servers using KRaft. The recommended setup for production is at least 3 brokers and 3 controllers. The procedure is the same as below. Just add one more broker. ...
eric | Jan. 3, 2023, 5:25 p.m.
There are a few posts with titles like "My First 5 Minutes With A Server", perhaps even 10 or 15 minutes. Most are helpful for a minimal setup, but I can't remember spending less than 60...
eric | Nov. 16, 2021, 3:31 p.m.
This is a quick guide to getting started with Promtail for Loki. Promtail is Grafana's native solution for getting logs into Loki and, as you should expect, is nicely integrated with it....
eric | May 7, 2021, 4:47 p.m.
Dealing with different file encodings for a set of data can be a bit of a pain [1], but there is one tool that is really useful in this situation. Using the readr package [2] with its guess_encoding-fu...
eric | May 7, 2021, 4:19 p.m.
Literate statistical programming can be a useful way to put text, code, data, and output all in one document. If you have a Linux desktop, and the distribution includes proper support for R, chances are y...
eric | May 6, 2021, 11:45 a.m.
RPubs is a free service for publishing HTML reports from R. Typically, the service is used to publish research. If you are working with R from the command line, there is a method for publishin...
eric | Feb. 25, 2020, 5 p.m.
You can add editing functionality to the Django administration backend by adding an editor to all or to selected forms in the backend. Here are a couple of fast-track how-tos on how to set this up with...
eric | Feb. 25, 2020, 4:08 p.m.
GitHub puts the upper file size limit at 100 MB. So what can you do if you have files bigger than that? Git Large File Storage (Git LFS) is an open-source extension to Git that allows you to work with la...
eric | Feb. 25, 2020, 3:32 p.m.
Comments on the book "Learn You Some Erlang For Great Good" by Fred Hébert, Chapter 4 - Types (or lack thereof). Pattern matching is much easier to achieve in Erlang than in many other languages, ...
eric | Jan. 21, 2020, 6:10 p.m.
Order, Order, Order! If you have done any major R project, you quickly get to the point where it is hard to keep everything in order - your scripts, your data, your output, your tests... If you have...
eric | Jan. 6, 2020, 8:45 p.m.
Comments on the book "Learn You Some Erlang For Great Good" by Fred Hébert, Chapter 3 - Syntax in Functions. Pattern matching is much easier to achieve in Erlang than in many other la...
eric | Nov. 3, 2019, 10:30 a.m.
A guide on how to install and operate a two-host XenServer pool with manual fail-over. This is the first part of a series of articles on a small-footprint XenServer setup with no single points of fail...
eric | July 1, 2019, 6:59 p.m.
Comments on the book "Learn You Some Erlang For Great Good" by Fred Hébert, Chapter 2 - Modules. Understanding how modules work and how they should be organized. Learning goals: How to create a modu...
eric | May 1, 2019, 3:59 p.m.
Comments on the book "Learn You Some Erlang For Great Good" by Fred Hébert, Chapter 1 - Starting Out. Challenges: Remembering relevant shell commands and understanding immutability. I...
eric | March 6, 2019, 9:20 a.m.
I am trying to set up a small-footprint XenServer lab environment that does not have any single point of failure. That means two hosts in a pool, two NICs on every host and SR (storage repository), mu...
Experienced dev and PM. Data science, DataOps, Python and R. DevOps, Linux, clean code and agile. 10+ years working remotely. Polyglot. Startup experience.
LinkedIn Profile
Statistics & R - a blog about - you guessed it - statistics and the R programming language.
R-blog
Erlang Explained - a blog on the marvellous programming language Erlang.
Erlang Explained