Gain hands-on experience of Python programming with industry-standard machine learning techniques using pandas, scikit-learn, and XGBoost Key FeaturesThink critically about data and use it to form and test a hypothesisChoose an appropriate machine learning model and train it on your dataCommunicate data-driven insights with confidence and clarityBook Description If data is the new oil, then machine learning is the drill. As companies gain access to ever-increasing quantities of raw data, the ability to deliver state-of-the-art predictive models that support business decision-making becomes more and more valuable. In this book, you'll work on an end-to-end project based around a realistic data set and split up into bite-sized practical exercises. This creates a case-study approach that simulates the working conditions you'll experience in real-world data science projects. You'll learn how to use key Python packages, including pandas, Matplotlib, and scikit-learn, and master the process of data exploration and data processing, before moving on to fitting, evaluating, and tuning algorithms such as regularized logistic regression and random forest. Now in its second edition, this book will take you through the end-to-end process of exploring data and delivering machine learning models. Updated for 2021, this edition includes brand new content on XGBoost, SHAP values, algorithmic fairness, and the ethical concerns of deploying a model in the real world. By the end of this data science book, you'll have the skills, understanding, and confidence to build your own machine learning models and gain insights from real data. What you will learnLoad, explore, and process data using the pandas Python packageUse Matplotlib to create compelling data visualizationsImplement predictive machine learning models with scikit-learnUse lasso and ridge regression to reduce model overfittingEvaluate random forest and logistic regression model performanceDeliver business insights by presenting clear, convincing conclusionsWho this book is for Data Science Projects with Python – Second Edition is for anyone who wants to get started with data science and machine learning. If you're keen to advance your career by using data analysis and predictive modeling to generate business insights, then this book is the perfect place to begin. To quickly grasp the concepts covered, it is recommended that you have basic experience of programming with Python or another similar language, and a general interest in statistics.
Gain hands-on experience with industry-standard data analysis and machine learning tools in Python Key FeaturesTackle data science problems by identifying the problem to be solvedIllustrate patterns in data using appropriate visualizationsImplement suitable machine learning algorithms to gain insights from dataBook Description Data Science Projects with Python is designed to give you practical guidance on industry-standard data analysis and machine learning tools, by applying them to realistic data problems. You will learn how to use pandas and Matplotlib to critically examine datasets with summary statistics and graphs, and extract the insights you seek to derive. You will build your knowledge as you prepare data using the scikit-learn package and feed it to machine learning algorithms such as regularized logistic regression and random forest. You’ll discover how to tune algorithms to provide the most accurate predictions on new and unseen data. As you progress, you’ll gain insights into the working and output of these algorithms, building your understanding of both the predictive capabilities of the models and why they make these predictions. By then end of this book, you will have the necessary skills to confidently use machine learning algorithms to perform detailed data analysis and extract meaningful insights from unstructured data. What you will learnInstall the required packages to set up a data science coding environmentLoad data into a Jupyter notebook running PythonUse Matplotlib to create data visualizationsFit machine learning models using scikit-learnUse lasso and ridge regression to regularize your modelsCompare performance between models to find the best outcomesUse k-fold cross-validation to select model hyperparametersWho this book is for If you are a data analyst, data scientist, or business analyst who wants to get started using Python and machine learning techniques to analyze data and predict outcomes, this book is for you. Basic knowledge of Python and data analytics will help you get the most from this book. Familiarity with mathematical concepts such as algebra and basic statistics will also be useful.
Data Science Bookcamp is a comprehensive set of challenging projects carefully designed to grow your data science skills from novice to master. Learn data science with Python by building five real-world projects! In Data Science Bookcamp you’ll test and build your knowledge of Python and learn to handle the kind of open-ended problems that professional data scientists work on daily. Data Science Bookcamp is a comprehensive set of challenging projects carefully designed to grow your data science skills from novice to master. Veteran data scientist Leonard Apeltsin sets five increasingly difficult exercises that test your abilities against the kind of problems you’d encounter in the real world. As you solve each challenge, you’ll acquire and expand the data science and Python skills you’ll use as a professional data scientist. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
Learn to effectively manage data and execute data science projects from start to finish using Python Key FeaturesUnderstand and utilize data science tools in Python, such as specialized machine learning algorithms and statistical modelingBuild a strong data science foundation with the best data science tools available in PythonAdd value to yourself, your organization, and society by extracting actionable insights from raw dataBook Description Practical Data Science with Python teaches you core data science concepts, with real-world and realistic examples, and strengthens your grip on the basic as well as advanced principles of data preparation and storage, statistics, probability theory, machine learning, and Python programming, helping you build a solid foundation to gain proficiency in data science. The book starts with an overview of basic Python skills and then introduces foundational data science techniques, followed by a thorough explanation of the Python code needed to execute the techniques. You'll understand the code by working through the examples. The code has been broken down into small chunks (a few lines or a function at a time) to enable thorough discussion. As you progress, you will learn how to perform data analysis while exploring the functionalities of key data science Python packages, including pandas, SciPy, and scikit-learn. Finally, the book covers ethics and privacy concerns in data science and suggests resources for improving data science skills, as well as ways to stay up to date on new data science developments. By the end of the book, you should be able to comfortably use Python for basic data science projects and should have the skills to execute the data science process on any data source. What you will learnUse Python data science packages effectivelyClean and prepare data for data science work, including feature engineering and feature selectionData modeling, including classic statistical models (such as t-tests), and essential machine learning algorithms, such as random forests and boosted modelsEvaluate model performanceCompare and understand different machine learning methodsInteract with Excel spreadsheets through PythonCreate automated data science reports through PythonGet to grips with text analytics techniquesWho this book is for The book is intended for beginners, including students starting or about to start a data science, analytics, or related program (e.g. Bachelor’s, Master’s, bootcamp, online courses), recent college graduates who want to learn new skills to set them apart in the job market, professionals who want to learn hands-on data science techniques in Python, and those who want to shift their career to data science. The book requires basic familiarity with Python. A "getting started with Python" section has been included to get complete novices up to speed.
Understand the constructs of the Python programming language and use them to build data science projects Key Features Learn the basics of developing applications with Python and deploy your first data application Take your first steps in Python programming by understanding and using data structures, variables, and loops Delve into Jupyter, NumPy, Pandas, SciPy, and sklearn to explore the data science ecosystem in Python Book Description Python is the most widely used programming language for building data science applications. Complete with step-by-step instructions, this book contains easy-to-follow tutorials to help you learn Python and develop real-world data science projects. The “secret sauce” of the book is its curated list of topics and solutions, put together using a range of real-world projects, covering initial data collection, data analysis, and production. This Python book starts by taking you through the basics of programming, right from variables and data types to classes and functions. You’ll learn how to write idiomatic code and test and debug it, and discover how you can create packages or use the range of built-in ones. You’ll also be introduced to the extensive ecosystem of Python data science packages, including NumPy, Pandas, scikit-learn, Altair, and Datashader. Furthermore, you’ll be able to perform data analysis, train models, and interpret and communicate the results. Finally, you’ll get to grips with structuring and scheduling scripts using Luigi and sharing your machine learning models with the world as a microservice. By the end of the book, you’ll have learned not only how to implement Python in data science projects, but also how to maintain and design them to meet high programming standards. What you will learn Code in Python using Jupyter and VS Code Explore the basics of coding – loops, variables, functions, and classes Deploy continuous integration with Git, Bash, and DVC Get to grips with Pandas, NumPy, and scikit-learn Perform data visualization with Matplotlib, Altair, and Datashader Create a package out of your code using poetry and test it with PyTest Make your machine learning model accessible to anyone with the web API Who this book is for If you want to learn Python or data science in a fun and engaging way, this book is for you. You’ll also find this book useful if you’re a high school student, researcher, analyst, or anyone with little or no coding experience with an interest in the subject and courage to learn, fail, and learn from failing. A basic understanding of how computers work will be useful.
Data Science Projects with Python will help you get comfortable with using the Python environment for data science. This book will start you on your journey to mastering topics within machine learning. These skills will help you deliver the kind of state-of-the-art predictive models that are being used to deliver value to businesses across ...
Create, deploy, and test your Python applications, analyses, and models with ease using Streamlit Key FeaturesLearn how to showcase machine learning models in a Streamlit application effectively and efficientlyBecome an expert Streamlit creator by getting hands-on with complex application creationDiscover how Streamlit enables you to create and deploy apps effortlesslyBook Description Streamlit shortens the development time for the creation of data-focused web applications, allowing data scientists to create web app prototypes using Python in hours instead of days. Getting Started with Streamlit for Data Science takes a hands-on approach to helping you learn the tips and tricks that will have you up and running with Streamlit in no time. You'll start with the fundamentals of Streamlit by creating a basic app and gradually build on the foundation by producing high-quality graphics with data visualization and testing machine learning models. As you advance through the chapters, you'll walk through practical examples of both personal data projects and work-related data-focused web applications, and get to grips with more challenging topics such as using Streamlit Components, beautifying your apps, and quick deployment of your new apps. By the end of this book, you'll be able to create dynamic web apps in Streamlit quickly and effortlessly using the power of Python. What you will learnSet up your first development environment and create a basic Streamlit app from scratchExplore methods for uploading, downloading, and manipulating data in Streamlit appsCreate dynamic visualizations in Streamlit using built-in and imported Python librariesDiscover strategies for creating and deploying machine learning models in StreamlitUse Streamlit sharing for one-click deploymentBeautify Streamlit apps using themes, Streamlit Components, and Streamlit sidebarImplement best practices for prototyping your data science work with StreamlitWho this book is for This book is for data scientists and machine learning enthusiasts who want to create web apps using Streamlit. Whether you're a junior data scientist looking to deploy your first machine learning project in Python to improve your resume or a senior data scientist who wants to use Streamlit to make convincing and dynamic data analyses, this book will help you get there! Prior knowledge of Python programming will assist with understanding the concepts covered.
Step-by-step guide to practising data science techniques with Jupyter notebooksDescription Modern businesses are awash with data, making data driven decision-making tasks increasingly complex. As a result, relevant technical expertise and analytical skills are required to do such tasks. This book aims to equip you with just enough knowledge of Python in conjunction with skills to use powerful tool such as Jupyter Notebook in order to succeed in the role of a data scientist. The book starts with a brief introduction to the world of data science and the opportunities you may come across along with an overview of the key topics covered in the book. You will learn how to setup Anaconda installation which comes with Jupyter and preinstalled Python packages. Before diving in to several supervised, unsupervised and other machine learning techniques, you’ll learn how to use basic data structures, functions, libraries and packages required to import, clean, visualize and process data. Several machine learning techniques such as regression, classification, clustering, time-series etc have been explained with the use of practical examples and by comparing the performance of various models. By the end of the book, you will come across few case studies to put your knowledge to practice and solve real-life business problems such as building a movie recommendation engine, classifying spam messages, predicting the ability of a borrower to repay loan on time and time series forecasting of housing prices. Remember to practice additional examples provided in the code bundle of the book to master these techniques. Audience The book is intended for anyone looking for a career in data science, all aspiring data scientists who want to learn the most powerful programming language in Machine Learning or working professionals who want to switch their career in Data Science. While no prior knowledge of Data Science or related technologies is assumed, it will be helpful to have some programming experience. Key Features · Acquire Python skills to do independent data science projects · Learn the basics of linear algebra and statistical science in Python way · Understand how and when they're used in data science · Build predictive models, tune their parameters and analyze performance in few steps · Cluster, transform, visualize, and extract insights from unlabelled datasets · Learn how to use matplotlib and seaborn for data visualization · Implement and save machine learning models for real-world business scenarios Table of Contents 1 ) Data Science Fundamentals 2 ) Installing Software and Setting up 3 ) Lists and Dictionaries 4 ) Function and Packages 5 ) NumPy Foundation 6 ) Pandas and Dataframe 7 ) Interacting with Databases 8 ) Thinking Statistically in Data Science 9 ) How to import data in Python? 10 ) Cleaning of imported data 11 ) Data Visualization 12 ) Data Pre-processing 13 ) Supervised Machine Learning 14 ) Unsupervised Machine Learning 15 ) Handling Time-Series Data 16 ) Time-Series Methods 17 ) Case Study – 1 18 ) Case Study – 2 19 ) Case Study – 3 20 ) Case Study – 4
Summary Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you're already using, including Pandas, NumPy, and Scikit-Learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work! Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. You'll find registration instructions inside the print book. About the Technology An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease. About the Book Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you'll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you'll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker. What's inside Working with large, structured and unstructured datasets Visualization with Seaborn and Datashader Implementing your own algorithms Building distributed apps with Dask Distributed Packaging and deploying Dask apps About the Reader For data scientists and developers with experience using Python and the PyData stack. About the Author Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company. Table of Contents PART 1 - The Building Blocks of scalable computing Why scalable computing matters Introducing Dask PART 2 - Working with Structured Data using Dask DataFrames Introducing Dask DataFrames Loading data into DataFrames Cleaning and transforming DataFrames Summarizing and analyzing DataFrames Visualizing DataFrames with Seaborn Visualizing location data with Datashader PART 3 - Extending and deploying Dask Working with Bags and Arrays Machine learning with Dask-ML Scaling and deploying Dask
If you are an aspiring data scientist and you have at least a working knowledge of data analysis and Python, this book will get you started in data science. Data analysts with experience of R or MATLAB will also find the book to be a comprehensive reference to enhance their data manipulation and machine learning skills.