
International School Report: Introduction to Data Mining and Analysis in Business


<span class="text_page_counter">Trang 1</span><div class="page_container" data-page="1">

<b>VIETNAM NATIONAL UNIVERSITY, HANOI</b>

<b>INTERNATIONAL SCHOOL</b>

<b>Report Introduction to data mining and analysis in business</b>

<b>Research group: Group 4</b>

<b>Class: INS 205201</b>

<b>Group members: </b>

</div><span class="text_page_counter">Trang 2</span><div class="page_container" data-page="2">

<b>Table of contents:</b>

<b>1. The concept of data mining...3</b>

<b>2. The main functions of data mining...3</b>

<b>3. RapidMiner Data Mining General Introduction...4</b>

<b>II. Data collection...6</b>

<b>1. Overview of shares of Joint Stock Commercial Bank for Investment and Development of Vietnam (BID)...6</b>

</div><span class="text_page_counter">Trang 3</span><div class="page_container" data-page="3">

<b>1. The concept of data mining</b>

Data mining is the process of sorting through large data sets to find patterns and relationships that can be used in data analysis to help solve business challenges. Data mining techniques and technologies let enterprises forecast future trends and make better-informed business decisions.

Data mining is a crucial component of data analytics as a whole and one of the fundamental fields in data science, which uses advanced analytics methods to uncover valuable information in data sets. At a more detailed level, data mining is one step in the knowledge discovery in databases (KDD) process, a data science approach for obtaining, processing, and evaluating data. Although the two terms are often used interchangeably, data mining and KDD are more frequently understood to be separate concepts.

Data mining refers to extracting, or mining, knowledge from vast amounts of data. The work of locating and sorting out the significant and required items in a sizable data set is often compared to extracting gold from rocks and sand.

<b>2. The main functions of data mining</b>

Data mining is broken down into several main directions, as follows:

- Concept description: describing, summarizing, and characterizing concepts.

Text summarization is one example.

</div><span class="text_page_counter">Trang 4</span><div class="page_container" data-page="4">

- Association rules: a relatively straightforward form of knowledge representation.

For instance: "Of the men who go to the supermarket, 60% buy beer, and up to 80% of those will also buy bitter beef." Association rules are frequently used in the financial, legal, medical, and stock market sectors.

- Classification and forecasting: assigning an item to one of a set of predetermined classes.

Classifying geographical areas according to meteorological data is one example. This approach frequently uses machine learning techniques such as artificial neural networks and decision trees. It is also referred to as supervised learning.

- Clustering: grouping items into clusters that are not known in advance (neither the names nor the number of the clusters). It is also referred to as unsupervised learning.

- Sequence mining: a method that involves more trial and error than association rule mining. Because it is highly predictive, this approach is commonly used in the financial industry and the stock market.
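The association rule example above can be made concrete with two standard measures, support and confidence. The following is a minimal sketch in Python, not part of the original report: the baskets are invented so that the figures match the 60% and 80% in the example, and the function names `support` and `confidence` are hypothetical helpers.

```python
# Hypothetical shopping baskets, invented purely for illustration:
# 15 of the 25 baskets contain beer, and 12 of those also contain beef.
baskets = (
    [{"beer", "beef"}] * 12
    + [{"beer"}] * 3
    + [{"bread"}] * 10
)


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset, transactions):
    """Share of all transactions that contain the itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = count_containing(set(antecedent) | set(consequent), transactions)
    return both / count_containing(antecedent, transactions)


beer_support = support({"beer"}, baskets)
rule_confidence = confidence({"beer"}, {"beef"}, baskets)
print(f"support(beer) = {beer_support:.0%}")
print(f"confidence(beer -> beef) = {rule_confidence:.0%}")
```

A rule is usually kept only when both measures exceed thresholds chosen by the analyst.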

<b>3. RapidMiner Data Mining General Introduction</b>

<b>1. What is RapidMiner?</b>

• RapidMiner is open source software that offers a platform for business analytics, predictive modeling, text mining, data mining, and machine learning.

• RapidMiner is used for research, education, training, application development, and industrial applications.

</div><span class="text_page_counter">Trang 5</span><div class="page_container" data-page="5">

<b>2. RapidMiner software download instructions</b>

Go to "taimienphi.com", then find the RapidMiner software and download it.

<b>3. Overview interface</b>

Below is a snapshot of the main interface of RapidMiner

</div><span class="text_page_counter">Trang 6</span><div class="page_container" data-page="6">

<b>II. Data collection</b>

<b>1. Overview of shares of Joint Stock Commercial Bank for Investment and Development of Vietnam (BID)</b>

The Joint Stock Commercial Bank for Investment and Development of Vietnam (BIDV) was founded in 1957 and has been in operation for more than 66 years.

On January 24, 2014, BIDV formally issued shares under the stock code BID. BID's share price has generally risen since then, from an initial public offering price of 18,500 VND per share to a peak of 49,000 VND per share on January 25, 2022, and it is expected to continue rising in 2023. Due to its strong

</div><span class="text_page_counter">Trang 7</span><div class="page_container" data-page="7">

base as a state-owned commercial bank, BID stock has a high level of investor trust. The State Bank of Vietnam holds a majority interest in the Joint Stock

<b>2. Data collection</b>

- Step 1: Visit the website "vn.investing.com", then search for BID.

- Step 2: Set the time frame for which you want to collect data, then apply it.

</div><span class="text_page_counter">Trang 8</span><div class="page_container" data-page="8">

- Step 3: Select download data.

</div><span class="text_page_counter">Trang 9</span><div class="page_container" data-page="9">

<b>3. Fix data after downloading</b>

- Step 1: Select the date column, then change the cell format to date.

- Step 2: Use the IF function and the RIGHT function to convert the quantity column to numeric form.
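The two clean-up steps above can also be sketched outside the spreadsheet. The Python below is an illustration under assumptions the report does not state: that the quantity column holds strings with a trailing K, M, or B suffix (for example "1.25M"), that commas are thousands separators, and that dates are written as day/month/year. The names `parse_volume` and `parse_date` are hypothetical.

```python
from datetime import date, datetime

# Assumed suffix letters; the report only says IF and RIGHT were used,
# not which suffixes actually appear in the downloaded file.
SUFFIX_MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_volume(text: str) -> float:
    """Convert a string such as '1.25M' or '350K' into a plain number."""
    text = text.strip().replace(",", "")  # assumes ',' is a thousands separator
    if not text:
        raise ValueError("empty volume value")
    suffix = text[-1].upper()             # same idea as RIGHT(cell, 1)
    if suffix in SUFFIX_MULTIPLIERS:      # same idea as the IF branch
        return float(text[:-1]) * SUFFIX_MULTIPLIERS[suffix]
    return float(text)


def parse_date(text: str, fmt: str = "%d/%m/%Y") -> date:
    """Parse a date string; the day/month/year format is an assumption."""
    return datetime.strptime(text.strip(), fmt).date()


print(parse_volume("1.25M"), parse_volume("350K"), parse_date("25/01/2022"))
```

If the real file uses a different date layout or decimal convention, only the `fmt` argument and the comma handling would need to change.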

<b>III. Running Data</b>

RapidMiner is a powerful tool for data analysis and data mining. Below is an explanation of the steps we performed in RapidMiner:

<b>Step 1: Launch RapidMiner:</b>

The first step is to open the RapidMiner software on your computer.

Proceed to log in to your RapidMiner account (if you don't have an account, create one in the "Create a new RapidMiner account" section).

</div><span class="text_page_counter">Trang 10</span><div class="page_container" data-page="10">

After successfully logging in, RapidMiner will have the following interface:

<b>Step 2: Create a Process:</b>

A process in RapidMiner is where you perform data analysis tasks. You can create a new process by clicking on the "New Process" button or selecting "File" -> "New Process".

</div><span class="text_page_counter">Trang 11</span><div class="page_container" data-page="11">

<b>Step 3: Perform Data Import:</b>

In the toolbar, select "Import Data", and the screen will display "Where is Your Data?". This step requires you to specify the location from which you want to import the data. You can choose from the following options:

- My Computer: This option allows you to import data from a file on your computer. You need to specify the file path.

- Database: If your data is stored in a database, you can import data from that database. You will need to provide connection information and a query to retrieve the data.

Because our team's file was already saved on the computer, we selected "My Computer" to specify where the data to import is stored.

</div><span class="text_page_counter">Trang 12</span><div class="page_container" data-page="12">

<b>Select the data location</b>

This step specifies the exact location of the data you want to import. After selecting the "My Computer" option, navigate to the file and provide its path so that RapidMiner can import the data.

<b>Select the cells to import</b>

</div><span class="text_page_counter">Trang 13</span><div class="page_container" data-page="13">

After determining the data location, you need to specify the cells in the data source that you want to import. You can select cells by defining either cell ranges or column ranges.

<b>Format your columns</b>

This step allows you to format the data columns after they have been imported. You provide information about the format of each column, such as string, integer, float, date, etc. This helps RapidMiner understand and correctly process the values in the data columns.
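To illustrate what assigning a type to each column accomplishes, here is a small Python sketch that reads delimited text and converts every column with its own converter. It is not RapidMiner's import wizard: the column names, the sample rows, and the `load_rows` helper are all hypothetical.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names and placeholder rows; the real export may differ.
SAMPLE_CSV = (
    "Date,Close,Volume\n"
    "02/01/2020,30000,1000000\n"
    "03/01/2020,30500,1250000\n"
)

# One converter per column, analogous to choosing a type for each column.
COLUMN_TYPES = {
    "Date": lambda text: datetime.strptime(text, "%d/%m/%Y").date(),
    "Close": float,
    "Volume": int,
}


def load_rows(handle):
    """Read delimited text and convert each value with its column's converter."""
    reader = csv.DictReader(handle)
    return [
        {name: COLUMN_TYPES[name](value) for name, value in row.items()}
        for row in reader
    ]


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row)
```

Declaring the types up front means a malformed value fails at import time instead of silently distorting later statistics.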

</div><span class="text_page_counter">Trang 14</span><div class="page_container" data-page="14">

<b>Where to store the data?</b>

The final step is to determine the storage location for the imported data.

<b>Step 4: Data Retrieval:</b>

</div><span class="text_page_counter">Trang 15</span><div class="page_container" data-page="15">

After the data is imported, it is represented as an ExampleSet, which can be used to perform analysis and build models in your data analysis process. You can explore and analyze the data using the features of the ExampleSet: Data, Statistics, Visualizations, and Annotations.

<b>Data:</b>

Under the Data section, you can view and manipulate the data within the ExampleSet. You can view the data by examining data samples or information about attributes. You can perform operations like filtering, sorting, selection, and data transformations according to your analysis requirements.

<b>Statistics:</b>

Under the Statistics section, you can view statistical information about the ExampleSet and its attributes. You can explore measures such as mean, variance, sum, standard deviation, correlations between attributes, and other statistical information. This helps you understand and analyze the distribution of, and relationships between, data attributes.
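For readers who want to see how the measures named above are computed, the following Python sketch calculates them for two short series. The numbers are invented placeholders, not BID data, and `pearson` is a hypothetical helper written out by hand so the example does not depend on any particular library version.

```python
import math
import statistics

# Invented placeholder values, not taken from the BID data set.
closes = [30.0, 31.0, 29.0, 32.0, 33.0]   # thousand VND
volumes = [1.0, 1.4, 0.9, 1.6, 2.0]       # million shares


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


summary = {
    "sum": sum(closes),
    "mean": statistics.fmean(closes),
    "variance": statistics.variance(closes),   # sample variance (n - 1)
    "stdev": statistics.stdev(closes),         # sample standard deviation
    "corr_close_volume": pearson(closes, volumes),
}
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Whether a tool reports sample or population variance is worth checking, because the two differ noticeably on small data sets.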

</div><span class="text_page_counter">Trang 16</span><div class="page_container" data-page="16">

<b>Visualizations:</b>

The Visualizations section provides tools and charts to visually represent the data in the ExampleSet. You can create various types of charts such as bar charts, line charts, distribution plots, correlation plots, scatter plots, and more. These charts help you gain a better understanding of the

<b>Annotations:</b>

</div><span class="text_page_counter">Trang 17</span><div class="page_container" data-page="17">

The Annotations section allows you to attach annotations, comments, or notes to the ExampleSet to store additional information or remarks about the data. You can add annotations for the entire ExampleSet or for specific data samples within it. This helps you store relevant information and share important data details with other team members working on the project.

Together, the Data, Statistics, Visualizations, and Annotations features in RapidMiner help you explore, analyze, and gain a deeper understanding of the data within the ExampleSet. They provide tools for data manipulation, statistical analysis, visual representation, and added context through annotations, which lets you uncover patterns, relationships, and insights in your data.

<b>IV. Analyzing</b>

After running the data in RapidMiner, our team obtained aggregated data tables and graphs. Below is a review of the charts that show BID's stock price movements:

</div><span class="text_page_counter">Trang 18</span><div class="page_container" data-page="18">

Data: BID stock information from 2014-2023

In general, BID stock has grown strongly since it was listed on the stock exchange, and the uptrend is clearly visible.

BID shares were first offered to the public in December 2012 with a par value of 10,000 VND/share and a starting price of 18,500 VND/share.

The lowest BID stock price, 12,700 VND/share, was recorded on December 17, 2014, December 29, 2014, and December 31, 2014.

The highest BID stock price, 49,000 VND/share, was recorded on January 25, 2022.
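Finding the lowest and highest prices, and every date on which they occurred, is a simple scan over the price history. The sketch below is a Python illustration rather than the RapidMiner procedure the group used. The first four rows repeat the figures quoted above, the last row is an invented filler value, and `extremes` is a hypothetical helper.

```python
from datetime import date

history = [
    (date(2014, 12, 17), 12_700),
    (date(2014, 12, 29), 12_700),
    (date(2014, 12, 31), 12_700),
    (date(2022, 1, 25), 49_000),
    (date(2018, 6, 1), 30_000),   # invented filler row, not from the report
]


def extremes(rows):
    """Return (lowest price, its dates) and (highest price, its dates)."""
    low = min(price for _, price in rows)
    high = max(price for _, price in rows)
    low_dates = [day for day, price in rows if price == low]
    high_dates = [day for day, price in rows if price == high]
    return (low, low_dates), (high, high_dates)


(low, low_dates), (high, high_dates) = extremes(history)
print("lowest:", low, [day.isoformat() for day in low_dates])
print("highest:", high, [day.isoformat() for day in high_dates])
```

Collecting all matching dates, rather than only the first, is what allows a tie such as the three December 2014 dates to be reported.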

</div><span class="text_page_counter">Trang 19</span><div class="page_container" data-page="19">

Chart: Volume of BID shares traded from 2014-2023

We can see that the volume of BID shares traded increased sharply in 2017 and 2022, with a downward trend in 2015 and 2020. By the end of 2022, the volume of BID shares traded had decreased again.
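A year-by-year comparison like the one above amounts to summing the traded volume per calendar year. The Python sketch below shows that aggregation on invented placeholder rows; it does not reproduce the chart, and `volume_by_year` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date

# Invented placeholder rows: (trading day, shares traded).
trades = [
    (date(2016, 3, 1), 1_000_000),
    (date(2016, 9, 1), 1_500_000),
    (date(2017, 3, 1), 3_000_000),
    (date(2017, 9, 1), 4_000_000),
]


def volume_by_year(rows):
    """Sum the traded volume for each calendar year, sorted by year."""
    totals = defaultdict(int)
    for day, volume in rows:
        totals[day.year] += volume
    return dict(sorted(totals.items()))


yearly = volume_by_year(trades)
for year, total in yearly.items():
    print(year, total)
```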

</div><span class="text_page_counter">Trang 20</span><div class="page_container" data-page="20">

Chart: Closing price of BID stock from 2014-2023

According to the chart above, the closing price of the stock has generally tended to increase over the years, with strong increases in 2018 and 2022. The price decreased at the beginning of 2023, but by the middle of 2023 it tended to increase again.

</div><span class="text_page_counter">Trang 21</span><div class="page_container" data-page="21">

Chart: Change rate of BID stock price from 2014-2023

The chart shows that the stock price changed strongly in 2018, 2021, and this year, 2023. In contrast, the BID stock price was quite stable in 2015, 2017, and early 2020.

The analysis of stock prices shows that BID's stock price has just gone through a difficult period with strong fluctuations but is gradually stabilizing and showing signs of growth again.

To forecast future stock prices, our group combined the analysis of BID stock data over the years with factors that directly affect BIDV, assessing the business situation and influencing factors. Visit the banking industry, follow
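The report does not define the change rate. A common definition, assumed here, is the percentage change from one closing price to the next: (current - previous) / previous × 100. The Python sketch below applies that formula to invented placeholder prices, and `pct_changes` is a hypothetical helper.

```python
def pct_changes(prices):
    """Percentage change between each pair of consecutive prices."""
    return [
        (current - previous) / previous * 100
        for previous, current in zip(prices, prices[1:])
    ]


# Invented placeholder closing prices in VND, not actual BID quotes.
closes = [40_000, 41_000, 39_975]
changes = pct_changes(closes)
print([round(change, 2) for change in changes])
```

Under this definition, a stable period appears as values close to zero and a volatile period as large swings in either direction.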

</div><span class="text_page_counter">Trang 22</span><div class="page_container" data-page="22">

First, BIDV's profit is expected to grow strongly thanks to improving asset quality.

Currently, BIDV has settled all VAMC bonds (VAMC is the Vietnam Asset Management Company, which specializes in managing the assets of credit institutions operating in Vietnam). At the same time, the bank also plans to complete provisioning for and handling of all bad debts under the Restructuring Project in 2021. Since BIDV has already set aside provisions for most of the outstanding bad debts under the restructuring project, provisioning pressure from 2022 onward will be significantly reduced, helping profits grow strongly and thereby supporting the potential for BIDV's stock price to increase.

Second, credit growth is expected to be positive in 2023-2024.

This is reasonable given that the epidemic in Vietnam is well controlled and the focus of development is gradually shifting to the retail segment. In 2021, BID received a new credit growth limit of 9.5%.

Bank credit growth is expected to improve significantly as the epidemic in Vietnam is brought under control, credit demand increases, and the capital increase provides further impetus. At the same time, BIDV has also restructured its concentrated portfolio. In particular, the bank will increase the proportion of retail and SME (small and medium-sized enterprise) business to improve lending rates and limit bad debt risks.

Third, banking operating costs are well controlled.

BIDV's CIR (cost-to-income ratio) is low compared to the industry average, thanks to the application of technology in business operations as well as the support of strategic investors.

Fourth, there is a roadmap to increase capital over the period 2022-2023.

In 2022, BID will increase its charter capital through a stock dividend of 12.2% and an additional issuance to foreign shareholders at the rate of 8.5%. In the next two years, the State Bank will reduce its ownership rate to 65%, and the remaining foreign ownership rate will be 15%.

The plan to increase charter capital through a stock dividend is in the process of getting approval from the government. Meanwhile, the plan to increase charter capital by issuing shares to foreign shareholders at the rate of 8.5% is still in the negotiation stage.

BIDV's capital increase roadmap is expected to be delayed one year compared

</div><span class="text_page_counter">Trang 23</span><div class="page_container" data-page="23">

65% by the end of 2023. Thus, within the next two years, the remaining share ratio for foreign shareholders will be more than 15%, which is a relatively attractive level for foreign funds or another potential strategic investment partner in the market.

The above analysis of BID shares shows that BIDV is the leading bank in the industry in terms of scale and market share. In addition, BIDV has good resources and is in the final stage of active restructuring to improve asset quality. Given this long-term potential, BID stock still has considerable room for growth.

Above is basic information about BID stock as well as a BID stock valuation. Given the strengths of a state-owned commercial bank, we conclude that BID shares still have considerable growth potential and offer investors a certain level of safety.

In this report, we used RapidMiner, one of the most popular data analysis tools, to analyze BID stock price movements. The report covered how to collect the data and how to run it in RapidMiner, and then presented and analyzed the results. Based on the analysis of historical BID stock price movements, combined with other factors affecting BIDV's operations, our team has made predictions about the future of BID's share price. We hope these forecasts will be useful to BIDV's investors and shareholders in the future.

<small>www.finhay.com.vn. 2023. Cổ phiếu BID có đáng đầu tư không? Phân tích tiềm năng tăng trưởng và những lưu ý - Finhay. [ONLINE] Available at: [Accessed 30 May 2023].</small>

<small>vn.investing.com. 2023. Joint Stock Commercial Bank for Investment and Development of Vietnam (BID) Giá Cả Lịch Sử - Investing.com. [ONLINE] Available at: [Accessed 30 May 2023].</small>

<small>vn.tradingview.com. 2023. BID Giá cổ phiếu và Biểu đồ — HOSE:BID —TradingView. [ONLINE] Available at: [Accessed 30 May 2023].</small>

<small>topi.vn. 2023. Cổ phiếu BID và những tiềm năng đầu tư trong năm 2023. [ONLINE] Available at: [Accessed 30 May 2023].</small>

</div>
