
Data Science

Course Descriptions

Note: Not all courses are offered every semester, and new courses may be added at any time. Check the schedule of classes for the latest offerings.

DATA 601: Introduction to Data Science [3]

The goal of this class is to give students an introduction to, and hands-on experience with, all phases of the data science process using real data and modern tools. Topics include data formats, loading, and cleaning; data storage in relational and non-relational stores; data governance; data analysis with supervised and unsupervised learning using R and similar tools, along with sound evaluation methods; data visualization; and scaling up with cluster computing, MapReduce, Hadoop, and Spark. Prerequisite: Enrollment in the Data Science program. Other students may be admitted with instructor permission.

DATA 602: Introduction to Data Analysis and Machine Learning [3]

This course provides a broad introduction to the practical side of machine learning and data analysis. It examines the end-to-end processing pipeline: extracting and identifying useful features that best represent the data, applying a few of the most important machine learning algorithms, and evaluating their performance for modeling data. Topics covered include decision trees, logistic regression, linear discriminant analysis, linear and non-linear regression, basis functions, support vector machines, neural networks, Bayesian networks, bias/variance theory, ensemble methods, clustering, evaluation methodologies, and experiment design. Prerequisite: Enrollment in the Data Science program. Other students may be admitted with instructor permission. Corequisite: DATA 601: Introduction to Data Science.

DATA 603: Platforms for Big Data Processing [3]

The goal of this course is to introduce methods, technologies, and computing platforms for performing data analysis at scale. Topics include the theory and techniques for data acquisition, cleansing, aggregation, management of large heterogeneous data collections, processing, and information and knowledge extraction. Students are introduced to map-reduce, streaming, and external memory algorithms and their implementations using Hadoop and its ecosystem (HBase, Hive, Pig, and Spark). Students will gain practical experience in analyzing large existing databases. Prerequisite: Enrollment in the Data Science program. Other students may be admitted with instructor permission. Corequisite: DATA 601: Introduction to Data Science.

DATA 604: Data Management [3]

This course introduces students to the data management, storage, and manipulation tools common in data science. Students will get an overview of relational database management systems and various NoSQL database technologies, and apply them to real scenarios. Topics include ER and relational data models, storage and concurrency preliminaries, relational databases and SQL queries, NoSQL databases, and data governance. Prerequisite: Enrollment in the Data Science program. Other students may be admitted with instructor permission. Corequisite: DATA 601: Introduction to Data Science.

DATA 605: Ethical and Legal Issues in Data Science [3]

This course provides a comprehensive overview of important legal and ethical issues pertaining to the full life cycle of data science. Students learn how to think through the ethics of making decisions and inferences based on data, and how important cases and laws have shaped the data science field. Students will use real and hypothetical case studies across various domains to explore these issues. Prerequisite: Enrollment in the Data Science program. Other students may be admitted with instructor permission. Corequisite: DATA 601: Introduction to Data Science.

DATA 606: Capstone in Data Science [3]

This is a semi-independent course that gives advanced graduate students in the Data Science program the opportunity to apply the knowledge, skills, and tools they have learned to a real-world data science project. Students will work with a real data set and go through the entire process of completing the project. The project will be conducted with industry, government, and academic partners, who will be responsible for providing the data set, with guidance and feedback from the instructor. Prerequisite: Completion of the required courses.

ENMG 652: Management, Leadership and Communication [3]

Students learn effective management and communication skills through case-study analysis, reading, class discussion, and role-playing. The course covers topics such as effective listening, setting expectations, delegation, coaching, performance evaluations, conflict management, negotiation with senior management, and managing with integrity.

GES 773: GIS Modeling Techniques [3]

This course addresses the concepts, tools, and techniques of GIS modeling. It presents modeling concepts and theory and provides opportunities for hands-on model design, construction, and application. Emphasis is placed on model calibration and validation.

GES 774: Spatial Statistics [3]

This course investigates statistical techniques for exploring and characterizing spatial phenomena. The course covers local and global cluster analysis, spatial autocorrelation, interpolation, and kriging, and gives students exposure to prominent GIS statistical packages. Emphasis is placed on exploratory spatial data analysis (ESDA) to develop spatial cognition and analytical skills, with practical applications to modeling spatial phenomena in computer environments.

GES 778: Advanced Visualization and Presentation [3]

Web technologies are providing increasingly sophisticated environments for visualization of spatial data. This course explores advanced techniques for visualizing multivariate and multidimensional data. Topics include advanced cartographic techniques, 3D, dynamic data update, and temporal modeling. Students will learn to create geospatial data-driven Web apps with modern technologies and open source software, including HTML5, JavaScript, and D3. Project-based learning will allow students to advance through the course at a pace that's tailored to their backgrounds. Although the course requires no advanced knowledge of Web technologies, students with previous programming experience will have a wider range of project options.

IS 721: Semi Structured Data Management [3]

Database Management Systems (DBMS) have been dominated by relational systems (RDBMS) for over 30 years. Due to changes in hardware, bandwidth, and use cases, systems are changing. Multiple processors, gigabit network speeds, and the Internet as a platform for distributed systems are changing the way computing gets done. RDBMS is not being superseded, but many so-called ‘non-standard’ system architectures are now being developed and deployed for specific application classes. We will look at a developing category of such systems, sometimes referred to as ‘NoSQL’ systems, that are becoming important for semi-structured information in web applications. We will cover current systems from conceptual and practical standpoints. We will read papers on representative systems and do simple programming against the databases. Students should have taken a relational database class and a programming class, and should be familiar with elementary web development with HTML and JavaScript.

IS 722: Systems and Information Integration [3]

The integration of systems and the seamless exchange of information stored in them address a very common problem: organizations merge and inherit information systems that are not compatible with each other. Data systems and information should interoperate easily for the success of the organization. This course investigates the various technologies in the field of information integration, with an emphasis on semantic interoperation of systems. Topics include modeling data semantics, semantic interoperability, metadata, semantic integration patterns, context-awareness, semantic networks, mediation and wrapper techniques, data warehouses, and integration servers. Students will keep abreast of the latest technologies and research on data semantics and information integration, and will gain practical experience integrating information from disparate and heterogeneous systems.

IS 733: Data Warehousing and Data Mining [3]

The purpose of this course is to provide a comprehensive discussion of using organizational databases to enable decision support through warehousing and mining of data. This course will provide an in-depth understanding of the technical, business, and research issues in each of these two areas. Issues in data warehousing include designing a multi-dimensional data model; cleansing and loading data; determining refresh cycles and methods; administrative aspects of running a data warehouse, including efficient data retrieval using bitmap and join indexes; reporting; ad hoc querying; and multi-dimensional operations such as slicing, dicing, pivoting, drill-down, and roll-up. Topics in data mining include justifying the need for knowledge discovery in databases and data mining methods such as clustering, classification, Bayesian networks, association rules, and visualization. New areas of research and development in data mining and data warehousing will also be discussed.



© UMBC Division of Professional Studies · 1000 Hilltop Circle, Sherman Hall East 4th Floor, Baltimore, MD 21250 · 410-455-2336 · dps@umbc.edu