# Computational Data Analysis in Chemistry

Humans in general have fairly poor intuition for probability. As anyone who has been subjected to some form of news feed knows, confirmation bias is also a major problem in our everyday lives. This issue only intensifies in science. This course covers probability theory, statistics, and how to use software to apply them to experimental data. Statistical analysis is a fundamental component of every scientist’s toolkit, needed both to faithfully represent the results of one's own experiments and to understand and evaluate the significance of results presented by colleagues and in the scientific literature.

This course is divided into two parts. The first, "Introduction to Matlab", is the more practical one: we’ll spend most of the time in the computer lab, learning the basics of programming as a tool to process and plot experimental results.

The second part, "Probability theory and statistics", is more theoretical, though it also includes two computer exercise sessions.

## Course structure

### Modules

Part 1 - Introduction to Matlab

The aim of the first part of the course is for the student to learn how to program in Matlab in order to process and perform basic statistical analysis of experimental data. This includes performing mathematical operations on data sets, fitting data to mathematical models, and producing publication quality plots.

No previous programming experience is assumed. As such, the major challenge in this part of the course is for the student to learn how to think at the computer programming level of abstraction: how to structure the solution of a problem as a sequence of fundamental tasks which translate to programming instructions. This is only achieved through practice.

Part 2 - Probability theory and statistics

The second part of the course aims to give students an understanding of the fundamental role of randomness in all experimental data, as well as the basics of probability theory, descriptive statistics, and statistical inference. This should allow the student not only to present their results correctly, but also to understand their significance and strengths. It should also help the student think more critically about any discussion of quantitative data.

### Teaching format

Part 1 - Introduction to Matlab, 4 hp

This part of the course consists of eight short lectures, each immediately followed by a longer computer exercise session, where we progressively get familiar with Matlab, first as a fancy calculator capable of sophisticated mathematical calculations, and then as a versatile and powerful programming language. Learning the basic skills of programming should prove extremely valuable for the student, regardless of whether they pursue a career in academia or industry.

Part 2 - Probability theory and statistics, 3.5 hp

This part of the course consists of seven lectures, each accompanied by an exercise session. There will also be two computer exercises, where we’ll introduce statistical programming.

### Assessment

Part 1 - Introduction to Matlab, 4 hp

The evaluation of Part 1 consists of two computer labs, where we provide actual experimental data for the student to analyze and discuss. This will require the student to leverage the skills practiced during the exercise sessions, as well as some knowledge of chemistry acquired during the first year of the program.

Part 2 - Probability theory and statistics, 3.5 hp

The evaluation of Part 2 consists of a written exam.

#### Examiner

Mats Johnsson
mats.johnsson@mmk.su.se

## Schedule

The schedule will be available no later than one month before the start of the course. We do not recommend print-outs as changes can occur. At the start of the course, your department will advise where you can find your schedule during the course.

## Course literature

Note that the course literature can be changed up to two months before the start of the course.

Part 1 - Introduction to Matlab

B. Hahn and D. Valentine. Essential MATLAB for Engineers and Scientists, 7th Edition. Academic Press, 2019

Part 2 - Probability theory and statistics

Probability & Statistics for Engineers & Scientists, ISBN: 9781292161365

## Contact

Coordinator for Part 1: Introduction to Matlab
Coordinator for Part 2: Probability theory and statistics
Chemistry Section & Student Affairs Office