Hands-On Data Analysis with
Pandas

Efficiently perform data collection, wrangling, analysis,
and visualization using Python

Stefanie Molin

BIRMINGHAM - MUMBAI


Hands-On Data Analysis with Pandas
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means, without the prior written permission of the publisher, except in the case of brief
quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information
presented. However, the information contained in this book is sold without warranty, either express or
implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any
damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products
mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the
accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Devika Battike
Content Development Editor: Athikho Sapuni Rishana
Senior Editor: Martin Whittemore
Technical Editor: Vibhuti Gawde
Copy Editor: Safis Editing
Project Coordinator: Kirti Pisat


Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Arvindkumar Gupta

First published: July 2019
Production reference: 2160919
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78961-532-6

www.packtpub.com


When I think back on all I have accomplished, I know that I couldn't have done it without
the support and love of my parents. This book is dedicated to both of you: to Mom, for
always believing in me and teaching me to believe in myself. I know I can do anything I
set my mind to because of you. And to Dad, for never letting me skip school and sharing
a countdown with me.


Packt.com

Subscribe to our online digital library for full access to over 7,000 books and videos,
as well as industry leading tools to help you plan your personal development and
advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and
Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content

Did you know that Packt offers eBook versions of every book published, with PDF
and ePub files available? You can upgrade to the eBook version at www.packt.com and
as a print book customer, you are entitled to a discount on the eBook copy. Get in
touch with us at for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for
a range of free newsletters, and receive exclusive discounts and offers on Packt books
and eBooks.


Foreword
Recent advancements in computing and artificial intelligence have completely
changed the way we understand the world. Our current ability to record and analyze
data has already transformed industries and inspired big changes in society.
Stefanie Molin's Hands-On Data Analysis with Pandas is much more than an
introduction to the subject of data analysis or the pandas Python library; it's a guide
to help you become part of this transformation.
Not only will this book teach you the fundamentals of using Python to collect,
analyze, and understand data, but it will also expose you to important software
engineering, statistical, and machine learning concepts that you will need to be
successful.
Using examples based on real data, you will be able to see firsthand how to apply
these techniques to extract value from data. In the process, you will learn important
software development skills, including writing simulations, creating your own
Python packages, and collecting data from APIs.
Stefanie possesses a rare combination of skills that makes her uniquely qualified to
guide you through this process. Being both an expert data scientist and a strong
software engineer, she can not only talk authoritatively about the intricacies of the
data analysis workflow, but also about how to implement it correctly and efficiently
in Python.
Whether you are a Python programmer interested in learning more about data
analysis, or a data scientist learning how to work in Python, this book will get you up
to speed fast, so you can begin to tackle your own data analysis projects right away.
Felipe Moreno
New York, June 10, 2019.
Felipe Moreno has been working in information security for the last two decades. He currently
works for Bloomberg LP, where he leads the Security Data Science team within the Chief
Information Security Office, and focuses on applying statistics and machine learning to
security problems.


Contributors
About the author
Stefanie Molin is a data scientist and software engineer at Bloomberg LP in NYC,
tackling tough problems in information security, particularly revolving around
anomaly detection, building tools for gathering data, and knowledge sharing. She has
extensive experience in data science, designing anomaly detection solutions, and
utilizing machine learning in both R and Python in the AdTech and FinTech
industries. She holds a B.S. in operations research from Columbia University's Fu
Foundation School of Engineering and Applied Science, with minors in economics,
and entrepreneurship and innovation. In her free time, she enjoys traveling the world,
inventing new recipes, and learning new languages spoken among both people and
computers.
Writing this book was a tremendous amount of work, but I have grown a lot
through the experience: as a writer, as a technologist, and as a person. This
wouldn't have been possible without the help of my friends, family, and colleagues.
I'm very grateful to you all. In particular, I want to thank Aliki Mavromoustaki,
Felipe Moreno, Suphannee Sivakorn, Lucy Hao, Javon Thompson, Alexander
Comerford, and Ryan Molin. (The full version of my acknowledgments can be found
on my GitHub; see the preface for the link.)


About the reviewer
Aliki Mavromoustaki is the lead data scientist at Tasman Analytics. She works with
direct-to-consumer companies to deliver scalable infrastructure and implement
event-driven analytics. Previously, she worked at Criteo, an AdTech company that employs
machine learning to help digital commerce companies target valuable customers.
Aliki worked on optimizing marketing campaigns and designed statistical
experiments comparing Criteo products. Aliki holds a PhD in fluid dynamics from
Imperial College London, and was an assistant adjunct professor in applied
mathematics at UCLA.

Packt is searching for authors like you
If you're interested in becoming an author for Packt, please visit
authors.packtpub.com and apply today. We have worked with thousands of
developers and tech professionals, just like you, to help them share their insight with
the global tech community. You can make a general application, apply for a specific
hot topic that we are recruiting an author for, or submit your own idea.


Table of Contents
Preface

Section 1: Getting Started with Pandas
Chapter 1: Introduction to Data Analysis
Chapter materials
Fundamentals of data analysis
Data collection
Data wrangling
Exploratory data analysis
Drawing conclusions

Statistical foundations
Sampling
Descriptive statistics

Measures of central tendency
Mean
Median
Mode

Measures of spread

Range
Variance
Standard deviation
Coefficient of variation
Interquartile range
Quartile coefficient of dispersion

Summarizing data
Common distributions
Scaling data

Quantifying relationships between variables
Pitfalls of summary statistics

Prediction and forecasting
Inferential statistics

Setting up a virtual environment
Virtual environments
venv

Windows
Linux/macOS

Anaconda

Installing the required Python packages
Why pandas?
Jupyter Notebooks
Launching JupyterLab
Validating the virtual environment

Closing JupyterLab

Summary
Exercises
Further reading
Chapter 2: Working with Pandas DataFrames
Chapter materials
Pandas data structures
Series
Index
DataFrame

Bringing data into a pandas DataFrame
From a Python object
From a file
From a database
From an API

Inspecting a DataFrame object

Examining the data
Describing and summarizing the data

Grabbing subsets of the data
Selection
Slicing

Indexing
Filtering

Adding and removing data
Creating new data
Deleting unwanted data

Summary
Exercises
Further reading


Section 2: Using Pandas for Data Analysis
Chapter 3: Data Wrangling with Pandas
Chapter materials
What is data wrangling?
Data cleaning
Data transformation

The wide data format
The long data format

Data enrichment

Collecting temperature data
Cleaning up the data
Renaming columns
Type conversion

Reordering, reindexing, and sorting data

Restructuring the data
Pivoting DataFrames
Melting DataFrames

Handling duplicate, missing, or invalid data
Finding the problematic data
Mitigating the issues

Summary
Exercises
Further reading
Chapter 4: Aggregating Pandas DataFrames

Chapter materials
Database-style operations on DataFrames
Querying DataFrames
Merging DataFrames

DataFrame operations

Arithmetic and statistics
Binning and thresholds
Applying functions
Window calculations
Pipes

Aggregations with pandas and numpy
Summarizing DataFrames
Using groupby
Pivot tables and crosstabs

Time series

Time-based selection and filtering
Shifting for lagged data
Differenced data
Resampling
Merging

Summary
Exercises
Further reading
Chapter 5: Visualizing Data with Pandas and Matplotlib

Chapter materials
An introduction to matplotlib
The basics
Plot components
Additional options

Plotting with pandas

Evolution over time
Relationships between variables

Distributions
Counts and frequencies

The pandas.plotting subpackage
Scatter matrices
Lag plots
Autocorrelation plots
Bootstrap plots

Summary
Exercises
Further reading
Chapter 6: Plotting with Seaborn and Customization Techniques
Chapter materials
Utilizing seaborn for advanced plotting
Categorical data
Correlations and heatmaps
Regression plots
Distributions
Faceting

Formatting

Titles and labels
Legends
Formatting axes

Customizing visualizations
Adding reference lines
Shading regions

Annotations
Colors

Summary
Exercises
Further reading


Section 3: Applications - Real-World Analyses Using Pandas
Chapter 7: Financial Analysis - Bitcoin and the Stock Market
Chapter materials
Building a Python package
Package structure
Overview of the stock_analysis package

Data extraction with pandas

The StockReader class
Bitcoin historical data from HTML
S&P 500 historical data from Yahoo! Finance
FAANG historical data from IEX

Exploratory data analysis

The Visualizer class family
Visualizing a stock
Visualizing multiple assets

Technical analysis of financial instruments
The StockAnalyzer class
The AssetGroupAnalyzer class
Comparing assets

Modeling performance

The StockModeler class
Time series decomposition
ARIMA
Linear regression with statsmodels
Comparing models

Summary
Exercises

Further reading
Chapter 8: Rule-Based Anomaly Detection
Chapter materials
Simulating login attempts
Assumptions
The login_attempt_simulator package
Helper functions
The LoginAttemptSimulator class

Simulating from the command line

Exploratory data analysis
Rule-based anomaly detection
Percent difference
Tukey fence
Z-score
Evaluating performance

Summary
Exercises
Further reading


Section 4: Introduction to Machine Learning with Scikit-Learn
Chapter 9: Getting Started with Machine Learning in Python

Chapter materials
Learning the lingo
Exploratory data analysis
Red wine quality data
White and red wine chemical properties data

Planets and exoplanets data

Preprocessing data

Training and testing sets
Scaling and centering data
Encoding data
Imputing
Additional transformers
Pipelines

Clustering


k-means

Grouping planets by orbit characteristics
Elbow point method for determining k
Interpreting centroids and visualizing the cluster space

Evaluating clustering results

Regression

Linear regression

Predicting the length of a year on a planet
Interpreting the linear regression equation
Making predictions

Evaluating regression results
Analyzing residuals
Metrics

Classification

Logistic regression

Predicting red wine quality
Determining wine type by chemical properties

Evaluating classification results
Confusion matrix

Classification metrics

Accuracy and error rate
Precision and recall
F score
Sensitivity and specificity

ROC curve
Precision-recall curve

Summary
Exercises
Further reading
Chapter 10: Making Better Predictions - Optimizing Models
Chapter materials
Hyperparameter tuning with grid search
Feature engineering
Interaction terms and polynomial features
Dimensionality reduction
Feature unions
Feature importances

Ensemble methods
Random forest
Gradient boosting
Voting

Inspecting classification prediction confidence
Addressing class imbalance
Under-sampling
Over-sampling

Regularization
Summary
Exercises
Further reading
Chapter 11: Machine Learning Anomaly Detection
Chapter materials

Exploring the data
Unsupervised methods
Isolation forest
Local outlier factor
Comparing models

Supervised methods
Baselining

Dummy classifier
Naive Bayes

Logistic regression

Online learning

Creating the PartialFitPipeline subclass
Stochastic gradient descent classifier
Building our initial model
Evaluating the model
Updating the model
Presenting our results
Further improvements

Summary
Exercises
Further reading


Section 5: Additional Resources
Chapter 12: The Road Ahead
Data resources

Python packages
Seaborn
Scikit-learn

Searching for data
APIs

Websites


Finance
Government data
Health and economy
Social networks
Sports
Miscellaneous

Practicing working with data
Python practice
Summary
Exercises
Further reading
Solutions

Appendix

Data analysis workflow
Choosing the appropriate visualization
Machine learning workflow
Other Books You May Enjoy

Index



Preface
Data science is often described as an interdisciplinary field where programming
skills, statistical know-how, and domain knowledge intersect. It has quickly become
one of the hottest fields of our society, and knowing how to work with data has
become essential in today's careers. Regardless of the industry, role, or project, data
skills are in high demand, and learning data analysis is the key to making an impact.
Fields in data science cover many different aspects of the spectrum: data analysts
focus more on extracting business insights, while data scientists focus more on
applying machine learning techniques to the business's problems. Data engineers
focus on designing, building, and maintaining data pipelines used by data analysts
and scientists. Machine learning engineers share much of the skill set of the data
scientist and, like data engineers, are adept software engineers. The data science
landscape encompasses many fields, but for all of them, data analysis is a
fundamental building block. This book will give you the skills to get started,
wherever your journey may take you.
The traditional skill set in data science involves knowing how to collect data from
various sources, such as databases and APIs, and process it. Python is a popular
language for data science that provides the means to collect and process data, as well
as to build production-quality data products. Since it is open source, it is easy to get
started with data science by taking advantage of the libraries written by others to
solve common data tasks and issues.
Pandas is the powerful and popular library synonymous with data science in
Python. This book will give you a hands-on introduction to data analysis using
pandas on real-world datasets, such as those dealing with the stock market, simulated
hacking attempts, weather trends, earthquakes, wine, and astronomical data. Pandas
makes data wrangling and visualization easy by giving us the ability to work
efficiently with tabular data.
Once we have learned how to conduct data analysis, we will explore a number of
applications. We will build Python packages and try our hand at stock analysis,
anomaly detection, regression, clustering, and classification with the help
of additional libraries commonly used for data visualization, data wrangling, and
machine learning, such as Matplotlib, Seaborn, NumPy, and Scikit-Learn. By the time
you finish this book, you will be well-equipped to take on your own data science
projects in Python.


Who this book is for
This book is written for people with varying levels of experience who want to learn
data science in Python, perhaps to apply it to a project, collaborate with data
scientists, and/or progress to working on machine learning production code with
software engineers. You will get the most out of this book if your background is
similar to one (or both) of the following:
You have prior data science experience in another language, such as R,
SAS, or MATLAB, and want to learn pandas in order to move your
workflow to Python.
You have some Python experience and are looking to learn about data
science using Python.

What this book covers
Chapter 1, Introduction to Data Analysis, teaches you the fundamentals of data
analysis, gives you a foundation in statistics, and guides you through getting your
environment set up for working with data in Python and using Jupyter Notebooks.

Chapter 2, Working with Pandas DataFrames, introduces you to the pandas library and
shows you the basics of working with DataFrames.

Chapter 3, Data Wrangling with Pandas, discusses the process of data manipulation,
shows you how to explore an API to gather data, and guides you through data
cleaning and reshaping with pandas.

Chapter 4, Aggregating Pandas DataFrames, teaches you how to query and merge
DataFrames, perform complex operations on them, including rolling calculations
and aggregations, and how to work effectively with time series data.

Chapter 5, Visualizing Data with Pandas and Matplotlib, shows you how to create your
own data visualizations in Python, first using the matplotlib library, and then from
pandas objects directly.

Chapter 6, Plotting with Seaborn and Customization Techniques, continues the
discussion on data visualization by teaching you how to use the seaborn library to
visualize your long-form data and giving you the tools you need to customize your
visualizations, making them presentation-ready.

Chapter 7, Financial Analysis – Bitcoin and the Stock Market, walks you through the
creation of a Python package for analyzing stocks, building upon everything learned
from Chapter 1, Introduction to Data Analysis, through Chapter 6, Plotting with
Seaborn and Customization Techniques, and applying it to a financial application.

Chapter 8, Rule-Based Anomaly Detection, covers simulating data and applying
everything learned from Chapter 1, Introduction to Data Analysis, through Chapter
6, Plotting with Seaborn and Customization Techniques, to catch hackers attempting to
authenticate to a website, using rule-based strategies for anomaly detection.

Chapter 9, Getting Started with Machine Learning in Python, introduces you to machine
learning and building models using the scikit-learn library.

Chapter 10, Making Better Predictions – Optimizing Models, shows you strategies for
tuning and improving the performance of your machine learning models.

Chapter 11, Machine Learning Anomaly Detection, revisits anomaly detection on login
attempt data, using machine learning techniques, all while giving you a taste of how
the workflow looks in practice.

Chapter 12, The Road Ahead, contains resources for taking your skills to the next level
and further avenues for exploration.

To get the most out of this book
You should be familiar with Python, particularly Python 3 and up. You should also
know how to write functions and basic scripts in Python, understand standard
programming concepts such as variables, data types, and control flow (if/else,
for/while loops), and be able to use Python as a functional programming language.
Some basic knowledge of object-oriented programming may be helpful, but is not
necessary. If your Python prowess isn't yet at this level, the Python documentation
includes a helpful tutorial for quickly getting up to speed:
https://docs.python.org/3/tutorial/index.html.
The accompanying code for the book can be found on GitHub at
https://github.com/stefmolin/Hands-On-Data-Analysis-with-Pandas. To get the most out of the
book, you should follow along in the Jupyter Notebooks as you read through each
chapter. We will cover setting up your environment and obtaining these files in
Chapter 1, Introduction to Data Analysis.

Lastly, be sure to do the exercises at the end of each chapter. Some of them may be
quite difficult, but they will make you much stronger with the material. Solutions for
each chapter's exercises can be found at
https://github.com/stefmolin/Hands-On-Data-Analysis-with-Pandas/tree/master/solutions
in their respective folders.

Download the color images
We also provide a PDF file that has color images of the screenshots/diagrams used in
this book. You can download it here:
/>
Conventions used
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names,
filenames, file extensions, pathnames, dummy URLs, and user input. Here is an
example: "Use pip to install the packages in the requirements.txt file."
A block of code is set as follows. The start of the line will be preceded by >>> and
continuations of that line will be preceded by ...:
>>> import pandas as pd
>>> df = pd.read_csv(
...     'data/fb_2018.csv', index_col='date', parse_dates=True
... )
>>> df.head()

Any code without the preceding >>> or ... is not something we will run—it is for
reference:
try:
    del df['ones']
except KeyError:
    # handle the error here
    pass


When we wish to draw your attention to a particular part of a code block, the relevant
lines or items are set in bold:
>>> df.plot(
...     x='date',
...     y='price',
...     kind='line',
...     title='Price over Time',
...     legend=False,
...     ylim=(0, None)
... )

Results will be shown without anything preceding the lines:
>>> pd.Series(np.random.rand(2), name='random')
0    0.235793
1    0.257935
Name: random, dtype: float64


Any command-line input or output is written as follows:
# Windows:
C:\path\of\your\choosing> mkdir pandas_exercises
# Linux, Mac, and shorthand:
$ mkdir pandas_exercises

Warnings or important notes appear like this.

Tips and tricks appear like this.

Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the
book title in the subject of your message and email us at


Errata: Although we have taken every care to ensure the accuracy of our content,
mistakes do happen. If you have found a mistake in this book, we would be grateful if
you would report this to us. Please visit www.packt.com/submit-errata, selecting
your book, clicking on the Errata Submission Form link, and entering the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet,
we would be grateful if you would provide us with the location address or website
name. Please contact us at with a link to the material.
If you are interested in becoming an author: If there is a topic that you have
expertise in and you are interested in either writing or contributing to a book, please
visit authors.packtpub.com.

Reviews
Please leave a review. Once you have read and used this book, why not leave a
review on the site that you purchased it from? Potential readers can then see and use
your unbiased opinion to make purchase decisions, we at Packt can understand what
you think about our products, and our authors can see your feedback on their book.
Thank you!
For more information about Packt, please visit packt.com.



Section 1: Getting Started with Pandas
Our journey begins with an introduction to data analysis and statistics, which will lay
a strong foundation for the concepts we will cover throughout the book. Then, we
will set up our Python data science environment, which contains everything we will
need to work through the examples, and get started with learning the basics of
pandas.
The following chapters are included in this section:
Chapter 1, Introduction to Data Analysis
Chapter 2, Working with Pandas DataFrames


1
Introduction to Data Analysis
Before we can begin our hands-on introduction to data analysis with pandas, we
need to learn about the fundamentals of data analysis. Those who have ever looked at
the documentation for a software library know how overwhelming it can be if you
have no clue what you are looking for. Therefore, it is essential that we not only
master the coding aspect, but also the thought process and workflow required to
analyze data, which will prove the most useful in augmenting our skill set in the
future.
Much like the scientific method, data science has some common workflows that we
can follow when we want to conduct an analysis and present the results. The
backbone of this process is statistics, which gives us ways to describe our data, make
predictions, and also draw conclusions about it. Since prior knowledge of statistics is
not a prerequisite, this chapter will give us exposure to the statistical concepts we will
use throughout this book, as well as areas for further exploration.
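
As a rough illustration of what "describing our data" means, the following minimal sketch computes a few measures of central tendency and spread. It is not code from the book: the temperature values are made up, and it relies only on Python's built-in statistics module, on the assumption that the pandas environment has not been set up yet at this point in the chapter.

```python
import statistics

# Hypothetical sample: made-up daily high temperatures, for illustration only
temps = [20, 22, 22, 25, 31]

# Describe the data with measures of central tendency and spread
summary = {
    'mean': statistics.mean(temps),
    'median': statistics.median(temps),
    'mode': statistics.mode(temps),
    'range': max(temps) - min(temps),
    'sample_std': statistics.stdev(temps),  # sample (n - 1) standard deviation
}

for name, value in summary.items():
    print(f'{name}: {value:.2f}')
```

Note that the mean (24) sits above the median (22) here because the single large value pulls it upward; comparing the two is a quick way to spot skew.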
After covering the fundamentals, we will get our Python environment set up for the
remainder of this book. Python is a powerful language, and its uses go way beyond
data science: building web applications, software, and web scraping, to name a few.
In order to work effectively across projects, we need to learn how to make virtual
environments, which will isolate each project's dependencies. Finally, we will learn
how to work with Jupyter Notebooks in order to follow along with the text.
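
To make the idea of an isolated environment concrete, here is a minimal sketch that creates one programmatically with the standard library's venv module. This is an assumption-laden illustration rather than the book's setup procedure: the directory name book_env is hypothetical, the environment is created in a temporary directory that is deleted afterward, and pip is deliberately left out so the example runs quickly and offline.

```python
import tempfile
import venv
from pathlib import Path

# Throwaway location so the example leaves nothing behind
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / 'book_env'  # hypothetical environment name

    # with_pip=False keeps this quick and offline; a real project would usually want pip
    venv.create(env_dir, with_pip=False)

    # Each virtual environment contains a pyvenv.cfg file describing it
    config_exists = (env_dir / 'pyvenv.cfg').is_file()
    print(f'Created isolated environment: {config_exists}')
```

Because each environment lives in its own directory with its own installed packages, two projects can depend on different versions of the same library without interfering with each other.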
The following topics will be covered in this chapter:
The core components of conducting data analysis
Statistical foundations
How to set up a Python data science environment


Chapter materials
All the files for this book are on GitHub at
https://github.com/stefmolin/Hands-On-Data-Analysis-with-Pandas. While having a
GitHub account isn't necessary to work through this book, it is a good idea to
create one, as it will serve as a portfolio
for any data/coding projects. In addition, working with Git will provide a version
control system and make collaboration easy.
Check out this article to learn some Git basics:
https://www.freecodecamp.org/news/learn-the-basics-of-git-in-under-10-minutes-da548267cc91/.

In order to get a local copy of the files, we have a few options (ordered from least
useful to most useful):
Download the ZIP file and extract the files locally
Clone the repository without forking it
Fork the repository and then clone it
This book includes exercises for every chapter; therefore, for those who want to keep
a copy of their solutions along with the original content on GitHub, it is highly
recommended to fork the repository and clone the forked version. When we fork a
repository, GitHub will make a repository under our own profile with the latest
version of the original. Then, whenever we make changes to our version, we can push
the changes back up. Note that if we simply clone, we don't get this benefit.
The relevant buttons for initiating this process are circled in the following screenshot:
