APPLICATION OF DATA MINING TOOLS FOR SELECTED SCRIPTS OF STOCK MARKET

International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.4, No.4, July 2014
DOI: 10.5121/ijdkp.2014.4405

K. S. Mahajan (1) and Dr. R. V. Kulkarni (2)
(1) Research student, Chh. Shahu Institute of Business Education and Research Center, Kolhapur, India
(2) Professor and HOD, Chh. Shahu Institute of Business Education and Research Center, Kolhapur, India

ABSTRACT

One of the most important problems in modern finance is finding efficient ways to summarize and visualize stock market data so that individuals and institutions get useful information about market behavior for investment decisions. Investment can be considered one of the fundamental pillars of a national economy. At present, many investors look for criteria to compare stocks and select the best, and they choose strategies that maximize the earning value of the investment process. The enormous amount of valuable data generated by the stock market has attracted researchers to explore this problem domain using different methodologies, and research in data mining has gained a high attraction due to the importance of its applications and the increasing generation of information. Data mining tools such as association rules, the rule induction method and the Apriori algorithm are used to find associations between different scripts of the stock market. Much research and development has also taken place regarding the reasons for fluctuations in the Indian stock exchange. Nowadays two important factors, gold prices and US dollar prices, increasingly dominate the Indian stock market. Statistical correlation is used to find the correlation between gold prices, dollar prices and the BSE index, and this helps the activities of stock operators, brokers, investors and jobbers.
These activities are based on forecasting the fluctuation of index share prices, gold prices, dollar prices and the transactions of customers. Hence the researcher has considered these problems as a topic for research.

KEYWORDS

Stock Market, Association Rules, Rule Induction Methods, Apriori Algorithm, Correlation, Data Mining.

1. INTRODUCTION

Data mining, the science and technology of exploring data in order to discover previously unknown patterns, is a part of the overall process of knowledge discovery in databases (KDD). In today's computer-driven world, these databases contain massive quantities of information. The accessibility of this information makes data mining important and necessary. Data mining can often improve existing models by finding additional, important variables, identifying interaction terms and detecting nonlinear relationships.

Financial institutions such as stock markets produce huge datasets that build a foundation for approaching these enormously complex and dynamic problems with data mining tools. The potential benefits of solving these problems have motivated extensive research for years. The specifics of data mining in finance come from the need to accommodate specific efficiency criteria (e.g., the maximum of trading profit) in addition to prediction accuracy, to coordinate multiresolution forecasts (minutes, days, weeks, months, and years), to benefit from very subtle patterns with a short lifetime, and to incorporate the impact of market players on market regularities. This work also considers the impact of gold and US dollar prices on the stock market and looks for associations between different scripts of the stock market, which helps investors to earn more profit.

The techniques that are used in this project are:
1. Association rules
2. Apriori algorithm
3. Rule induction method
4. Statistical correlation

1.1 Association Rule:

Unlike the other data mining functions, association is transaction based. In transaction processing, a case consists of a transaction, as in market basket analysis. The collection of items in the transaction is a multi-record attribute. Association rules are IF/THEN statements. Example: "If a customer purchases Infosys Ltd, then the customer also purchases Wipro Ltd with 60% confidence".

An association rule has two parts, an antecedent (if) and a consequent (then). An antecedent is an item found in the data. A consequent is an item that is found in combination with the antecedent. Association rules are created by analyzing data for frequent IF/THEN patterns and using the criteria Support and Confidence to identify the most important relationships. Support and Confidence are the two measures of an association rule. An association rule takes the form x=>y, where x and y are sets of items. The goal is to discover all the rules whose Support and Confidence are greater than or equal to the minimum support and minimum confidence respectively.

Steps To Generate Association Rules:
1. Generate all possible association rules.
2. Compute the support and confidence of all possible association rules.
3. Apply two threshold criteria, minimum support and minimum confidence, to obtain the association rules.
4. Minimum support and minimum confidence are taken as the average of all the calculated support values and all the calculated confidence values respectively.
5. If the calculated support and confidence are greater than or equal to the minimum support and minimum confidence, then these items are said to be associated with each other by an association rule.

SUPPORT: The Support of a rule indicates how frequently the items in the rule occur together. Example: Dr.Reddy's Lab and Cipla Ltd might appear together in 10% of the transactions.
Support is calculated as below:

Support (x=>y) = (Number of transactions containing x and y) / (Total number of transactions)

CONFIDENCE: Confidence is the proportion of times the IF/THEN statement has been found to be true. The confidence of a rule indicates the probability that the consequent appears in a transaction, given that the antecedent appears in it. Example: Dr.Reddy's Lab might appear in 20 transactions, and 10 of the 20 might also include Cipla Ltd. Therefore Dr.Reddy's Lab implies Cipla Ltd with 10/20 = 50% confidence. Confidence is calculated as below:

Confidence (x=>y) = Support (x=>y) / Support (x)

Example: Association Rules from BSE SENSEX. Here the researcher has selected sector-wise scripts to calculate the association between scripts of the same sector.

Pharmaceuticals Sector: From BSE SENSEX the researcher has selected Cipla Ltd, Dr.Reddy's Lab, Sun Pharma India Ltd, Glenmark Ltd and Orchid Chemicals Ltd to calculate the association between these same-sector scripts. Here minimum support is the average of all the calculated support values:

MINIMUM SUPPORT = sum of support / number of calculated support values = 56 / 10 = 5.6%

So minimum support is 5.6%.

MINIMUM CONFIDENCE = sum of confidence / number of calculated confidence values = 350 / 10 = 35%

So minimum confidence is 35%. The researcher applied the above rule for calculating minimum support and minimum confidence to obtain results for the scripts of other sectors. From the above data analysis the researcher can conclude that Cipla Ltd and Dr.Reddy's Lab go hand in hand, and that Dr.Reddy's Lab and Sun Pharma India Ltd also go hand in hand. So the researcher can say these scripts are strongly associated with each other.

2. APRIORI ALGORITHM:

Apriori is a classical algorithm designed to operate on databases containing transactions.
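A transaction database of this kind can be modeled as a list of item sets, and the Support and Confidence measures of Section 1.1 then reduce to simple counts over it. The following is a minimal sketch, not the authors' implementation. The function names support and confidence and the transaction data are hypothetical: the data mirrors the counts of the confidence example (20 transactions containing Dr.Reddy's Lab, 10 of which also contain Cipla Ltd), and the total of 100 transactions is an assumption chosen so that the pair appears in 10% of them.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical transaction database (assumption, not data from the paper):
# 10 transactions with both scripts, 10 with Dr.Reddy's Lab only, 80 unrelated.
TRANSACTIONS: List[FrozenSet[str]] = (
    [frozenset({"Dr.Reddy's Lab", "Cipla Ltd"})] * 10
    + [frozenset({"Dr.Reddy's Lab"})] * 10
    + [frozenset({"Glenmark Ltd"})] * 80
)


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Support(x=>y) / Support(x); 0.0 if the antecedent never occurs."""
    x = frozenset(antecedent)
    both = x | frozenset(consequent)
    support_x = support(x, transactions)
    return support(both, transactions) / support_x if support_x else 0.0


pair_support = support(["Dr.Reddy's Lab", "Cipla Ltd"], TRANSACTIONS)
rule_confidence = confidence(["Dr.Reddy's Lab"], ["Cipla Ltd"], TRANSACTIONS)
print(f"support = {pair_support:.0%}, confidence = {rule_confidence:.0%}")
```

A rule would then be kept only if both values reach the chosen minimum support and minimum confidence thresholds.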
The theory of the Apriori algorithm is that "All nonempty subsets of a frequent item set must also be frequent." The Apriori principle can be shown as below:

For all X, Y: (X is a subset of Y) => s(X) >= s(Y)

i.e. the support of an item set never exceeds the support of its subsets. This property is also known as the anti-monotone property of support. The algorithm is used to mine the frequent item sets. The Apriori algorithm is as follows:
 - Let k=1.
 - Generate frequent item sets of length 1.
 - Repeat until no new frequent item sets are identified.

Example: Support count (Dr.Reddy's Lab) = number of transactions containing Dr.Reddy's Lab = 18.

3. RULE INDUCTION TECHNIQUE

The rule induction technique retrieves all interesting patterns from the database. In the rule induction technique, a rule is of the form "if this then that". For example, a rule that a stock market might find in the data collected from market transaction reports would be: "If the Reliance Industries Ltd script is purchased then Oil and Natural Gas Corporation is purchased", or:
If Tata Steel then SAIL
If Mahindra then Hindustan Motors

In order for the rules to be useful, two pieces of information must be supplied along with the actual rule:
Accuracy: How often is the rule correct?
Coverage: How often does the rule apply?
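The Apriori procedure of Section 2 can be sketched in a few lines. This is an illustrative level-wise implementation under stated assumptions, not the authors' code. The function name apriori, the transactions and the 40% threshold are hypothetical, and the join and prune steps follow the standard textbook form of the algorithm, which the listed steps only summarize.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def apriori(transactions: List[FrozenSet[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every frequent item set mapped to its support (a fraction)."""
    n = len(transactions)

    def support(itemset: FrozenSet[str]) -> float:
        return sum(1 for t in transactions if itemset <= t) / n

    # k = 1: frequent item sets of length 1.
    singles = {frozenset([item]) for t in transactions for item in t}
    current = {c for c in singles if support(c) >= min_support}
    frequent = {c: support(c) for c in current}

    k = 1
    while current:  # repeat until no new frequent item sets are identified
        # Join: unions of two frequent k-item sets that have exactly k+1 items.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune: by the Apriori principle, every k-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k))}
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent


# Hypothetical transactions (assumption, not data from the paper).
DEMO = [
    frozenset({"Cipla Ltd", "Dr.Reddy's Lab", "Sun Pharma India Ltd"}),
    frozenset({"Cipla Ltd", "Dr.Reddy's Lab"}),
    frozenset({"Dr.Reddy's Lab", "Sun Pharma India Ltd"}),
    frozenset({"Cipla Ltd", "Glenmark Ltd"}),
    frozenset({"Dr.Reddy's Lab"}),
]
frequent_sets = apriori(DEMO, min_support=0.4)
for itemset, s in sorted(frequent_sets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{s:.0%}")
```

The prune step is where the anti-monotone property pays off: a candidate is discarded without scanning the transactions as soon as one of its subsets is known to be infrequent.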