HarvardX: Principles, Statistical and Computational Tools for Reproducible Data Science

4.2 stars

12 ratings

Learn skills and tools that support data science and reproducible research, to ensure you can trust your own research results, reproduce them yourself, and communicate them to others.

8 weeks

3–8 hours per week

Self-paced

Progress at your own speed

Free

Optional verification available

There is one session available:

110,024 already enrolled! After the course session ends, it will be archived.

Starts Apr 11

Ends Dec 10

About this course

Today the principles and techniques of reproducible research are more important than ever, across diverse disciplines from astrophysics to political science. No one wants to do research that can’t be reproduced, so this course is for anyone doing data-intensive research. While many of us come from a biomedical background, this course is for a broad audience of data scientists.

To meet the needs of the scientific community, this course will examine the fundamentals of methods and tools for reproducible research. Led by experienced faculty from the Harvard T.H. Chan School of Public Health, you will participate in six modules that will include several case studies that illustrate the significant impact of reproducible research methods on scientific discovery.

This course will appeal to students and professionals in biostatistics, computational biology, bioinformatics, and data science. The course content will blend video lectures, case studies, peer-to-peer engagements, and use of computational tools and platforms (such as R/RStudio and Git/GitHub), culminating in the presentation of a final reproducible research project.

We’ll cover Fundamentals of Reproducible Science; Case Studies; Data Provenance; Statistical Methods for Reproducible Science; Computational Tools for Reproducible Science; and Reproducible Reporting Science. These concepts are intended to translate to fields throughout the data sciences: physical and life sciences, applied mathematics and statistics, and computing.

Consider this course a survey of best practices: we’d like to make you aware of pitfalls in reproducible data science, some past failure and success stories, and tools and design patterns that might help make it all easier. But ultimately it’ll be up to you to take the skills you learn from this course to create your own environment in which you can easily carry out reproducible research, and to encourage and integrate with similar environments for your collaborators and colleagues. We look forward to seeing you in this course and the research you do in the future!

At a glance

Institution: HarvardX
Subject: Data Analysis
Level: Intermediate
Prerequisites:
- Basic knowledge of R and Git
- A computer that is capable of downloading software to run on it.

Language: English
Video Transcripts: اَلْعَرَبِيَّةُ, Deutsch, English, Español, Français, हिन्दी, Bahasa Indonesia, Português, Kiswahili, తెలుగు, Türkçe, 中文
Associated skills: Public Health, Statistics, Software Design Patterns, R (Programming Language), Astrophysics, Biostatistics, RStudio, Political Sciences, Git (Version Control System), Research Methodologies, Life Sciences, Computational Tools, GitHub, Research, Statistical Methods, Presentations, Computational Biology, Applied Mathematics, Bioinformatics, Data Science

What you'll learn

Understand a series of concepts, thought patterns, analysis paradigms, and computational and statistical tools that together support data science and reproducible research.
Fundamentals of reproducible science using case studies that illustrate various practices
Key elements for ensuring data provenance and reproducible experimental design
Statistical methods for reproducible data analysis
Computational tools for reproducible data analysis and version control (Git/GitHub, Emacs/RStudio/Spyder), reproducible data (data repositories/Dataverse), reproducible dynamic report generation (R Markdown/R Notebook/Jupyter/Pandoc), and workflows.
How to develop new methods and tools for reproducible research and reporting
How to write your own reproducible paper.

Syllabus

Module 1: Introduction to Reproducible Science

Module 2: Fundamentals of Reproducible Science

Definitions and Concepts
Factors affecting reproducibility

Module 3: Case Studies in Reproducible Research

Module 4: Data Provenance

Project Design
Journal Requirements
Repositories
Privacy and Security

Module 5: Computational Tools for Reproducible Science

R and RStudio
Python, Git, and GitHub
Creating a repository
Data sources
Dynamic report generation
Workflows

Module 6: An optional deeper dive into Statistical Methods for Reproducible Science

Prediction Models
Coefficient of determination
Brier score
Area Under the Curve (AUC)
Concordance in survival analysis
Cross-validation
Bootstrap
Simulations
Clustering

Who can take this course?

Unfortunately, people residing in one or more of the following countries or regions will not be able to register for this course: Iran, Cuba, and the Crimea region of Ukraine. While edX has obtained licenses from the U.S. Office of Foreign Assets Control (OFAC) to offer our courses to people in these countries and regions, the licenses we have received are not broad enough to allow us to offer this course in all locations. edX truly regrets that U.S. sanctions prevent us from offering all of our courses to everyone, no matter where they live.

Ways to take this course

Choose your path when you enroll.

	Verified Track	Audit Track
Cost	$149 USD	Free
Access to course materials	Unlimited	Limited (expires Jun 6)
World class institutions and universities
edX support
Shareable certificate upon completion
Graded assignments and exams

Visit the FAQ section for frequently asked questions about these tracks.
