Matthijs van Leeuwen
Associate professor/Director of Education
- Name: Dr. M. van Leeuwen
- Telephone: +31 71 527 7048
- E-mail: m.van.leeuwen@liacs.leidenuniv.nl
- ORCID iD: 0000-0002-0510-3549

Matthijs likes data, patterns, algorithms, and information theory. He strives for data mining and machine learning methods and results that are principled, interpretable, and incorporate existing knowledge. He is director of education of the Computer Science, Media Technology and ICT in Business and the Public Sector master's programmes. Besides this, he is a member of the LIACS management team and of the interdisciplinary research programme Society, Artificial Intelligence and Life Sciences (SAILS).
Matthijs is an associate professor, group leader of the Explanatory Data Analysis group, Programme Manager of the Master Computer Science, and a member of the interdisciplinary research programme Society, Artificial Intelligence and Life Sciences (SAILS). His primary research interest is exploratory data mining: how can we enable domain experts to explore and analyse their data, to discover structure and, ultimately, novel knowledge?
For this it is important that methods and results are explainable to domain experts, who may not be data scientists. His approach is to define and identify patterns that matter, i.e., succinct descriptions that characterise relevant structure present in the data. Which patterns matter strongly depends on the data and task at hand, hence defining the problem is one of the key challenges. Information theoretic concepts such as the minimum description length (MDL) principle have proven very useful to this end. Matthijs is also interested in interactive data mining, i.e., involving humans in the loop. Finally, he is interested in fundamental data mining research for real-world applications, both in science (e.g., life sciences, social sciences) and industry (e.g., manufacturing and engineering, aviation), as this is the best way to show that the theory works in practice.
Short bio
Matthijs was previously a (tenure track) assistant professor (2017-2020) and senior researcher (2015-2017) at Leiden University, and a postdoctoral researcher at KU Leuven (2011-2015) and Universiteit Utrecht (2009-2011). He defended his Ph.D. thesis, titled Patterns that Matter, in February 2010, at Universiteit Utrecht. He won several best paper awards at international conferences and was awarded NWO Rubicon, FWO Postdoc, and NWO TOP2 grants. He is General Chair of the IDA Council and editorial board member of Data Mining and Knowledge Discovery. Further, he co-organised a number of international conferences and workshops, and co-lectured tutorials on 'Information Theoretic Methods in Data Mining'.
Associate professor/Director of Education
- Science
- Leiden Institute of Advanced Computer Science
- Li Z., Liang S., Shi J. & Leeuwen M. van (2024), Cross-domain graph level anomaly detection, IEEE Transactions on Knowledge and Data Engineering 36(12): 7839-7850.
- Li Z., Zhu Y. & Leeuwen M. van (2023), A survey on explainable anomaly detection, ACM Transactions on Knowledge Discovery from Data 18(1): 23.
- Li Z. & Leeuwen M. van (2023), Explainable contextual anomaly detection using quantile regression forests, Data Mining and Knowledge Discovery 37: 2517-2563.
- Dijk M.K. van, Gawehns D. & Leeuwen M. van (2023), WEARDA: recording wearable sensor data for human activity monitoring, Journal of Open Research Software 11(1): 13.
- Arend B.W.H. van der, Verhagen I.E., Leeuwen M. van, Arend M.Q.T.P. van der, Casteren D.S. van & Terwindt G.M. (2023), Defining migraine days, based on longitudinal E-diary data, Cephalalgia 43(5).
- Papagioanni I. & Leeuwen M. van (2023), Discovering rule lists with preferred variables. Crémilleux B., Hess S. & Nijssen S. (Eds.), Advances in intelligent data analysis XXI. IDA 2023. 21st International Symposium on Intelligent Data Analysis, IDA 2023 12 April 2023 - 14 April 2023. Lecture Notes in Computer Science no. 13876. Cham: Springer. 340-352.
- Kroes S.K.S., Leeuwen M. van, Groenwold R.H.H. & Janssen M.P. (2023), Evaluating cluster-based synthetic data generation for blood-transfusion analysis, Journal of Cybersecurity and Privacy 3(4): 882-894.
- Lopez-Martinez-Carrasco A., Proença M.H., Juarez J.M., Leeuwen M. van & Campos M. (2023), Novel approach for phenotyping based on diverse Top-K subgroup lists. Juarez J.M., Marcos M., Stiglic G. & Tucker A. (Eds.), Artificial Intelligence in Medicine (AIME 2023). 21st International Conference on Artificial Intelligence in Medicine, AIME 2023 12 June 2023 - 15 June 2023. Lecture Notes in Computer Science no. 13897. Cham: Springer. 45-50.
- Vinkenoog M., Toivonen J., Leeuwen M. van, Janssen M.P. & Arvas M. (2023), The added value of ferritin levels and genetic markers for the prediction of haemoglobin deferral, Vox Sanguinis 118(10): 825-834.
- Yang L. & Leeuwen M. van (2023), Truly unordered probabilistic rule sets for multi-class classification. Amini M.R., Canu S., Fischer A., Guns T., Kralj N. & Tsoumakas G. (Eds.), Machine learning and knowledge discovery in databases: ECML PKDD 2022. European Conference, ECML PKDD 2022 19 September 2022 - 23 September 2022. Lecture Notes in Computer Science no. 13717. Cham: Springer. 87-103.
- Yang L., Baratchi M. & Leeuwen M. van (2023), Unsupervised discretization by two-dimensional MDL-based histogram, Machine Learning 112: 2397-2431.
- Rijn S.J. van, Schmitt S., Leeuwen M. van & Bäck T.H.W. (2023), Finding efficient trade-offs in multi-fidelity response surface modelling, Engineering Optimization 55(6): 946-963.
- Kroes S.K.S., Leeuwen M. van, Groenwold R.H.H. & Janssen M.P. (2023), Generating synthetic mixed discrete-continuous health records with mixed sum-product networks, Journal of the American Medical Informatics Association 30(1): 16-25.
- Vinkenoog M., Steenhuis M., Brinke A. ten, Hasselt J.G.C. van, Janssen M.P., Leeuwen M. van, Swaneveld F.H., Vrielink H., Watering L. van de, Quee F., Hurk K. van den, Rispens T., Hogema B. & Schoot C.E. van der (2022), Associations between symptoms, donor characteristics and IgG antibody response in 2082 COVID-19 convalescent plasma donors, Frontiers in Immunology 13: 821721.
- Manuel Proença H., Grünwald P.D., Bäck T.H.W. & Leeuwen M. van (2022), Robust subgroup discovery: discovering subgroup lists using MDL, Data Mining and Knowledge Discovery 36(5): 1885-1970.
- Vinkenoog M., Leeuwen M. van & Janssen M.P. (2022), Explainable haemoglobin deferral predictions using machine learning models: interpretation and consequences for the blood supply, Vox Sanguinis 117(11): 1262-1270.
- Zhong L., Leeuwen M. van & Li Z. (2022), Feature selection for fault detection and prediction based on event log analysis, ACM SIGKDD Explorations 24(2): 96-104.
- Manuel Proença H., Grünwald P.D., Bäck T.H.W. & Leeuwen M. van (2021), Discovering outstanding subgroup lists for numeric targets using MDL. Hutter F., Kersting K., Lijffijt J. & Valera I. (Eds.), Machine learning and knowledge discovery in databases. ECML PKDD 2020 14 September 2020 - 18 September 2020 no. 12457. Cham: Springer. 19-35.
- Marx A., Yang L. & Leeuwen M. van (2021), Estimating conditional mutual information for discrete-continuous mixtures using multi-dimensional adaptive histograms. Demeniconi C. & Davidson I. (Eds.), Proceedings of the 2021 SIAM International Conference on Data Mining (SDM). 2021 SIAM International Conference on Data Mining (SDM) 29 April 2021 - 1 May 2021: SIAM. 387-395.
- Kroes S.K., Janssen M.P., Groenwold R.H. & Leeuwen M. van (2021), Evaluating privacy of individuals in medical data, Health Informatics Journal 27(2).
- Kapoor S., Saxena D.K. & Leeuwen M. van (2021), Online summarization of dynamic graphs using subjective interestingness for sequential data, Data Mining and Knowledge Discovery 35(1): 88-126.
- Vinkenoog M., Hurk K. van den, Kraaij M. van, Leeuwen M. van & Janssen M.P. (2020), First results of a ferritin-based blood donor deferral policy in the Netherlands, Transfusion 60(8): 1785-1792.
- Manuel Proença H. & Leeuwen M. van (2020), Interpretable multiclass classification by MDL-based rule lists, Information Sciences 512: 1372-1393.
- Kapoor S., Saxena D.K. & Leeuwen M. van (2020), Discovering subjectively interesting multigraph patterns, Machine Learning 109(8): 1669-1696.
- Gautrais C., Cellier P., Leeuwen M. van & Termier A. (2020), Widening for MDL-Based Retail Signature Discovery. Berthold M.R., Feelders A. & Krempl G. (Eds.), Advances in intelligent data analysis XVIII. IDA 2020. International Symposium on Intelligent Data Analysis (IDA 2020) 27 April 2020 - 29 April 2020 no. 12080. Cham: Springer. 197-209.
- Faas M. & Leeuwen M. van (2020), Vouw: geometric pattern mining using the MDL principle. Berthold M.R., Feelders A. & Krempl G. (Eds.), Advances in intelligent data analysis XVIII. IDA 2020. International Symposium on Intelligent Data Analysis (IDA 2020) 27 April 2020 - 29 April 2020 no. 12080. Cham: Springer. 158-170.
- Gawehns D., Veiga G. & Leeuwen M. van (2019), Focus on Dynamics: a proof of principle in exploratory data mining of face-to-face interactions. 5th International Conference on Computational Social Sciences, Amsterdam. 17 July 2019 - 20 July 2019. [conference poster].
- Vinkenoog M., Janssen M. & Leeuwen M. van (2019), Challenges and Limitations in Clustering Blood Donor Hemoglobin Trajectories. Lemaire V., Malinowski S., Bagnall A., Bondu A., Guyet T. & Tavenard R. (Eds.), Advanced Analytics and Learning on Temporal Data. AALTD 2019. International Workshop on Advanced Analytics and Learning on Temporal Data (AALTD 2019) 20 September 2019 - 20 September 2019. Lecture Notes in Computer Science no. 11986. Cham: Springer International Publishing. 72-84.
- Leeuwen M. van, Chau D.H., Vreeken J., Shahaf D. & Faloutsos C. (2019), Addendum to the Special Issue on Interactive Data Exploration and Analytics (TKDD, Vol. 12, Iss. 1): Introduction by the Guest Editors. [other].
- Manuel Proença H., Klijn R., Bäck T.H.W. & Leeuwen M. van (2019), Identifying flight delay patterns using diverse subgroup discovery, Proceedings of the Symposium Series on Computational Intelligence (SSCI'18). 2018 Symposium Series on Computational Intelligence 18 November 2018 - 21 November 2018. Bangalore, India: IEEE. 60-67.
- Rijn S.J. van, Schmitt S., Olhofer M., Leeuwen M. van & Bäck T. (2018), Multi-Fidelity Surrogate Model Approach to Optimization. Aguirre H. (Ed.), GECCO'18 Proceedings of the Genetic and Evolutionary Computation Conference Companion. GECCO 2018 15 July 2018 - 19 July 2018. New York: ACM. 225-226.
- Leeuwen M. van, Chau P., Vreeken J., Shahaf D. & Faloutsos C. (Eds.) (2018), Editorial: TKDD Special Issue on Interactive Data Exploration and Analytics. ACM Transactions on Knowledge Discovery from Data: ACM.
- Os H.J.A. van, Ramos L.A., Hilbert A., Leeuwen M. van, Walderveen M.A.A. van, Kruyt N.D., Dippel D.W.J., Steyerberg E.W., Schaaf I.C. van der, Lingsma H.F., Schonewille W.J., Majoie C.B.L.M., Olabarriaga S.D., Zwinderman K.H., Venema E., Marquering H.A. & Wermer M.J.H. (2018), Predicting Outcome of Endovascular Treatment for Acute Ischemic Stroke: Potential Value of Machine Learning Algorithms, Frontiers in Neurology 9: 784.
- Stein B. van, Leeuwen M. van & Bäck T. (2017), Local Subspace-Based Outlier Detection using Global Neighbourhoods, 2016 IEEE International Conference on Big Data (Big Data): IEEE. 1136-1142.
- Stein B. van, Leeuwen M. van, Wang H., Purr S., Kreissl S., Meinhardt J. & Bäck T.H.W. (2017), Towards Data Driven Process Control in Manufacturing Car Body Parts, 2016 International Conference on Computational Science and Computational Intelligence CSCI. International Conference on Computational Science and Computational Intelligence (CSCI 2016) 15 December 2016 - 17 December 2016: IEEE CPS.
- Dzyuba V., Leeuwen M. van & De Raedt L. (2017), Flexible constrained sampling with guarantees for pattern mining, Data Mining and Knowledge Discovery 31(5): 1266-1293.
- Le T. van, Nijssen S., Leeuwen M. van & De Raedt L. (2017), Semiring Rank Matrix Factorisation, IEEE Transactions on Knowledge and Data Engineering 29(8): 1737-1750.
- Paramonov S., Leeuwen M. van & Raedt L. de (2017), Relational data factorization, Machine Learning 106(12): 1867-1904.
- Dzyuba V. & Leeuwen M. van (2017), Learning what matters - Sampling interesting patterns. Ceci M., Hollmén J. & Todorovski L. (Eds.), Machine Learning and Knowledge Discovery in Databases. ECMLPKDD 18 September 2017 - 22 September 2017. Lecture Notes in Computer Science no. 10535. Cham: Springer. 425-441.
- Dzyuba V. & Leeuwen M. van (2017), Learning what matters - Sampling interesting patterns. Kim J., Shim K., Cao L., Lee J.G., Lin X. & Moon Y.S. (Eds.), Advances in Knowledge Discovery and Data Mining. Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'17) 23 May 2017 - 26 May 2017. Lecture Notes in Computer Science no. 10234. Cham: Springer. 534-546.
- Stein B. van, Leeuwen M. van & Bäck T.H.W. (2016), Local Subspace-Based Outlier Detection using Global Neighbourhoods, 2016 IEEE International Conference on Big Data (Big Data). IEEE International Conference on Big Data 2016 5 December 2016 - 8 December 2016: IEEE Publishing.
- Rijn S.J. van, Wang H., Leeuwen M. van & Bäck T.H.W. (2016), Evolving the Structure of Evolution Strategies, 2016 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE SSCI 2016 6 December 2016 - 9 December 2016: IEEE Publishing. 1-8.
- Leeuwen M. van, Bie T. de, Spyropoulou E. & Mesnage C. (2016), Subjective interestingness of subgraph patterns, Machine Learning 105(1): 41-75.
- Le T. van, Leeuwen M. van, Fierro A.C., Maeyer D. de, Van den Eynden J., Verbeke L., De Raedt L., Marchal K. & Nijssen S.G.R. (2016), Simultaneous discovery of cancer subtypes and subtype features by molecular data integration, Bioinformatics 32(17): i445-i454.
- Copmans D., Meinl T., Dietz C., Leeuwen M. van, Ortmann J., Berthold M.R. & Witte P.A. de (2016), A KNIME-Based Analysis of the Zebrafish Photomotor Response Clusters the Phenotypes of 14 Classes of Neuroactive Molecules, Journal of biomolecular screening 21(5): 427-436.
- Le T. van, Leeuwen M. van, Nijssen S.G.R. & Raedt L. de (2015), Rank Matrix Factorisation. Cao T., Lim E.P., Zhou Z.H., Ho T.B., Cheung D. & Motoda H. (Eds.), Proceedings Advances in Knowledge Discovery and Data Mining. Advances in Knowledge Discovery and Data Mining - 19th Pacific-Asia Conference, PAKDD 2015 19 May 2015 - 22 May 2015 no. LNCS 9077. Cham: Springer. 734-746.
- Leeuwen M. van & Cardinaels L. (2015), VIPER - Visual Pattern Explorer. Bifet A., May M., Zadrozny B., Gavalda R., Pedreschi D., Bonchi F., Cardoso J. & Spiliopoulou M. (Eds.), ECML PKDD: Machine Learning and Knowledge Discovery in Databases. ECMLPKDD 7 September 2015 - 11 September 2015 no. 9286. Cham: Springer. 333-336.
- Fromont E., Bie T. de & Leeuwen M. van (Eds.) (2015), Advances in Intelligent Data Analysis XIV. Lecture Notes in Computer Science no. 9385. Cham: Springer.
- Paramonov S., Leeuwen M. van, Denecker M. & Raedt L. de (2015), An exercise in declarative modeling for relational query mining. Inoue K., Ohwada H. & Yamamoto A. (Eds.), Inductive Logic Programming. ILP 2015. 25th International Conference, ILP 2015 20 August 2015 - 22 August 2015 no. LNCS 9575. Cham: Springer. 166-182.
- Leeuwen M. van & Ukkonen A. (2015), Same bang, fewer bucks: efficient discovery of the cost-influence skyline. Venkatasubramanian S. & Ye J. (Eds.), Proceedings of the 2015 SIAM International Conference on Data Mining. 2015 SIAM International Conference on Data Mining 30 April 2015 - 2 May 2015: SIAM. 19-27.
- Chau P., Vreeken J., Leeuwen M. van & Faloutsos C. (2015), Proceedings of the ACM SIGKDD 2015 Full-day Workshop on Interactive Data Exploration and Analytics. [other].
- Aksehirli E., Nijssen S.G.R., Leeuwen M. van & Goethals B. (2015), Finding subspace clusters using ranked neighborhoods, 2015 IEEE International Conference on Data Mining Workshop (ICDMW). The 3rd International Workshop on High Dimensional Data Mining 14 November 2015 - 14 November 2015: IEEE Publishing. 831-838.
- Leeuwen M. van & Galbrun E. (2015), Association Discovery in Two-View Data, IEEE Transactions on Knowledge and Data Engineering 27(12): 3190-3202.
- Chau P., Vreeken J., Leeuwen M. van, Shahaf D. & Faloutsos C. (Eds.) (2013), IDEA '13 Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics: ACM.