marketing is growing fast as a way to increase business and sell products.
E-commerce sites send users product recommendations or lists of newly updated
products. They also send attractive offers to users. To recommend products,
e-commerce websites must understand users’ interests and behaviour in order to
adapt to customers’ requirements. Information about users’ behaviour is stored
in the server log file. We extract the log file to obtain this behaviour and
pre-process it to get session action details. To analyse log files, the
proposed system implements a linear temporal logic model checking approach
for the analysis of structured e-commerce web logs. By defining a common way of
mapping log records according to the e-commerce structure, web logs can be
easily converted into event logs where the behaviour of users is captured.
Then, different predefined queries can be performed to identify different
behavioural patterns that consider the different actions performed by a user
during a session.
Index Terms: behavioral patterns, data mining, e-commerce, model checking, web
Data mining algorithms are used to study these web
server log files. The main aim of this kind of algorithm is to find users’
behavior and users’ interests. A number of algorithms have been proposed in
recent years for data mining in the field of e-commerce, such as classification
techniques, clustering, association rules or sequential patterns. These
techniques are used along with data mining to discover hidden patterns and
Most of the data mining techniques in use today have
some limitations when applied to an e-commerce application.
They do not mine the user’s navigation in its proper sequence, and they
ignore causality relations such as the order of a user’s actions, the number of
pages visited, the product search sequence, and the number of times a user
visited a page. To overcome these limitations we propose the use of temporal
logic and model checking techniques as an alternative to data mining
techniques. The main approach is to analyze users’ behavior on an e-commerce
site to discover customers’ complex behavioral patterns by checking temporal
logic formulas describing such behaviors against the log model. First, user
behavior is generated from the web server log. After generation, a business
analyst can use a set of predefined queries which help to discover the way clients use
People are buying more and more over the Internet
instead of shopping in traditional stores. E-commerce provides customers with
the opportunity of browsing endless product catalogues, comparing prices, being
continuously informed, creating wish lists and enjoying a better service based
on their individual interests. This growing electronic market is highly
competitive, since a customer can easily move from one e-commerce site to
another when their needs are not satisfied. As a consequence,
e-commerce business analysts need to know and understand consumers’ behaviour
when they navigate through the website, as well as to identify the
reasons that motivated them to purchase or not to purchase a product. Gaining
this behavioral knowledge will allow e-commerce websites to deliver a more
personalized service to customers, retaining customers and increasing benefits.
REVIEW OF LITERATURE
Guimei Liu, Tam T. Nguyen et al. proposed a method called
Repeat Buyer Prediction for E-commerce. In this approach, they generated
different profiles for users, merchants, brands, categories, items and their
interactions through feature engineering. Using all these features they
trained different classification models, mainly GBM, Logistic
Regression, XGBoost, Factorization Machine and Random Forest. They also
used an ensemble technique to combine the different classifiers to improve the
J. Ben Schafer et al. [2] present a study of how
recommendation systems are related to conventional database analysis techniques.
They also developed a recommendation system which takes input from users
and from databases to give recommendations to the user.
A method for discovering clusters of e-commerce interest patterns
from clickstream data is given in [3] by Qiang Su and Lu Chen.
Analysis of the customer’s behavior is a very important task for the business
analyst in order to improve the application for the target market. For data
analysis they use different browsing behaviors such as the number of users,
their visiting sequence, and the time spent and visit frequency for each
category. Based on these they developed an improved clustering technique,
refined with set theory, to generate user interest patterns.
[4] presents Online Discovery of Declarative Process Models
from Event Streams. Most business processes are
controlled and supported by information systems. Such systems
record real-time information about business processes during execution. In
this work, the authors propose a framework for the discovery of LTL-based
declarative process models from streaming event data. This framework
continuously updates a set of valid business constraints based on the events
that occur in the event stream. The declarative model is expressed in Declare,
which combines a formal semantics, based on Linear Temporal Logic (LTL) on
finite traces, with a graphical notation.
Leandro G. Vasconcelos et al. proposed a new e-commerce
personalization technique called Exploiting client logs to support the
construction of adaptive e-commerce applications. In this work, they first
argue that it is possible to build personalization by observing and
analyzing the customer’s behavior while browsing an e-commerce web site. To
test this hypothesis, they built an application which allows automatic
generation and analysis of log files in real time [5].
Generally, two types of log files are generated: one
is a server-side log file and the other is a client-side log file. The
server-side log file is automatically generated by the server, and the
client-side log file is maintained for user analysis. Preparing these logs for
analysis includes three stages: first data cleaning, second user
identification and last session identification.
Based on these three stages, G. Neelima et al. developed an application
using web log mining called Predicting user behavior through Sessions using
the Weblog mining. In this method, they extract users’ information from given
log files. First, each user is recognized by his or her IP address, and
from that the user’s sessions are generated. After extracting the sessions,
the frequency with which the user visited a particular page is extracted, and
from this they analyze the user’s behavior [6].
[7] gives a method
for e-commerce data mining called Web usage mining to improve the design of an
e-commerce website. The authors present a set of stages which includes
data collection, data processing, extraction of useful data and its analysis.
The useful data is extracted by supervised and unsupervised data mining
algorithms through tasks such as clustering, association and
subgroup discovery. The results of this process are then discussed with
the design team to improve the e-commerce website.
Most studies on product recommendation so far
consider only past purchase data, i.e. historical data of the users. However,
some methods have considered users’ navigational and behavioral data for
recommendation. On the basis of this concept, Yong Soo
Kim and Bong-Jin Yum [8] proposed a new approach called Recommender system based on
clickstream data using association rule mining. In this approach, they
improve on the collaborative filtering method by calculating the
confidence levels between clicked products, between the products placed in the
basket, and between purchased products, respectively. After calculating the
confidence levels, the preference level is calculated from these confidence levels.
Roung-Shiunn Wu and Po-Hsuan Chou [9] proposed an approach
named Customer segmentation of multiple category data in e-commerce using a
soft-clustering approach. For e-commerce websites, it is necessary to
analyze users’ behavior. Online customer segmentation is the process of
dividing customers into multiple categories, which contributes to better
characterization and understanding. Therefore, to obtain a proper segmentation,
the authors developed a soft clustering method which uses a latent mixed-class
membership clustering approach to differentiate users on the basis of their
purchasing data across categories. For the segmentation, they used the latent
Dirichlet allocation model.
K. Sudheer Reddy, M. Kantha Reddy et al. proposed a method
for web usage mining. Web usage mining is a category of data mining
which identifies usage patterns in order to understand and better serve the
requirements of web applications. Web usage mining consists of three
stages, i.e. preprocessing, pattern discovery, and analysis. Data preprocessing
is the main and essential process in web usage mining, and it helps to improve
the quality of data mining. In this study, they present different data
preparation methods applied to access streams before the mining process starts,
to improve the data processing for unique identification of users and sessions [10].
The admin has the log dataset of the e-commerce website. On the e-commerce
website all entries are stored in a text file. The log dataset file is uploaded
to the system for analysis. The admin receives an analysis of the log dataset
which classifies the behaviour of users.
The raw data have relatively low business
value unless they can be transformed and processed to produce actionable
knowledge. Therefore, raw logs must be pre-processed to discard uninteresting
requests, to identify user sessions and to prepare the log for analysis.
The analysis module has four main steps:
In the cleaning phase we load all log data into the
system to find requests generated automatically by the system. The log file
stores all the daily activities of users. If a user encounters an error or bug,
it is registered in the status code of the requested link. In the cleaning
phase we delete requests caused by marketing redirects, that is, cases where
the user searched for some information but the system opened another web page
linked to marketing. In this phase only entries associated with an IP address
are kept; any other entries are removed from the system.
To identify direct user-side requests, we
check the method type, i.e. GET or POST. If an entry for an IP address is
neither a GET nor a POST request, it can be deleted directly from the database
or dataset. Some requests are generated by the browser, such as requests for
images, multimedia or video; these browser-generated requests are deleted from the log.
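The cleaning rules above can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the paper does not state the log format, so the sketch assumes Apache-style "combined" log lines with IPv4 addresses, and the names LOG_PATTERN, STATIC_EXTENSIONS and clean_log are hypothetical.

```python
import re

# Assumption: Apache-style log lines with an IPv4 client address.
LOG_PATTERN = re.compile(
    r'(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical list of resources that browsers request automatically.
STATIC_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".css", ".js", ".ico", ".mp4")


def clean_log(lines):
    """Keep only GET/POST page requests that carry an IP address."""
    kept = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:  # malformed entry or no IPv4 address
            continue
        entry = match.groupdict()
        if entry["method"] not in ("GET", "POST"):
            continue  # not a direct user-side request
        path = entry["url"].split("?", 1)[0].lower()
        if path.endswith(STATIC_EXTENSIONS):
            continue  # browser-generated request (image, video, ...)
        kept.append(entry)
    return kept


sample = [
    '10.0.0.1 - - [12/Mar/2018:10:00:01 +0000] "GET /product/42 HTTP/1.1" 200 512',
    '10.0.0.1 - - [12/Mar/2018:10:00:02 +0000] "GET /img/logo.png HTTP/1.1" 200 900',
    '10.0.0.2 - - [12/Mar/2018:10:00:05 +0000] "HEAD /index HTTP/1.1" 200 0',
    'no ip address in this entry',
]
cleaned = clean_log(sample)
```

Only the first sample line survives: the second is an image request, the third is neither GET nor POST, and the fourth has no IP address.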
Once the log dataset cleaning phase is complete,
the second phase is finding the sessions of each user. A number of user entries
are available in the file with the user’s login time and logout time. From
these we find the time each user spent, that is, how long the user was working
with the system. These session data are stored with the IP address and session
time as well as the date.
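A possible sketch of session identification is shown below. The text relies on login and logout times; since those events are not guaranteed to appear in every log, this sketch substitutes an inactivity timeout, which is an assumption and not part of the described system. SESSION_TIMEOUT and build_sessions are hypothetical names.

```python
from datetime import datetime, timedelta

# Assumption: a gap longer than this starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def build_sessions(entries):
    """Group (ip, timestamp) entries into sessions per IP address.

    Each session records the IP address, its date and its duration.
    """
    sessions = []
    open_sessions = {}  # ip -> index of that IP's most recent session
    for ip, ts in sorted(entries, key=lambda e: e[1]):
        idx = open_sessions.get(ip)
        if idx is not None and ts - sessions[idx]["end"] <= SESSION_TIMEOUT:
            sessions[idx]["end"] = ts  # request continues the session
        else:
            open_sessions[ip] = len(sessions)
            sessions.append({"ip": ip, "start": ts, "end": ts})
    for session in sessions:
        session["date"] = session["start"].date()
        session["duration"] = session["end"] - session["start"]
    return sessions


t0 = datetime(2018, 3, 12, 10, 0)
entries = [
    ("10.0.0.1", t0),
    ("10.0.0.1", t0 + timedelta(minutes=5)),
    ("10.0.0.1", t0 + timedelta(hours=2)),  # long gap: a new session
    ("10.0.0.2", t0 + timedelta(minutes=1)),
]
sessions = build_sessions(entries)
```

Here the first IP address yields two sessions and the second yields one, each stored with its IP address, date and session time.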
After we complete the session phase for
every user, we find the total number of users of the system. From the session
set we can easily obtain the number of IP addresses in the dataset. However,
the sessions contain multiple entries for the same user, as users log in again
and again in one day, so we identify each unique user by IP address. We can
also get the browser version. The log dataset has multiple entries per user,
from which we derive user details using the IP address. We find user sessions
and session entries to analyse patterns. Different users have different machine
IP addresses and use different browsers.
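User identification by IP address can be sketched as below. The record layout (IP address, browser string) and the function name identify_users are assumptions made for illustration. Note that identifying users by IP address alone treats several people behind one shared address as a single user.

```python
from collections import defaultdict


def identify_users(session_records):
    """Map each IP address to its session count and the browsers it used."""
    users = defaultdict(lambda: {"sessions": 0, "browsers": set()})
    for ip, browser in session_records:
        users[ip]["sessions"] += 1
        users[ip]["browsers"].add(browser)
    return dict(users)


# One record per session: (ip, browser string); the values are made up.
records = [
    ("10.0.0.1", "Firefox/58.0"),
    ("10.0.0.1", "Firefox/58.0"),  # same user logging in again
    ("10.0.0.2", "Chrome/64.0"),
]
users = identify_users(records)
total_users = len(users)
```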
2.4 Log preparation
This phase works on log preparation; its main
aim is to find the actions of the user. In the preparation process we work on
the user’s navigation, that is, which page is visited first, which page is
visited next, and so on. To categorize it we analyze the level of the web
pages. In a website, web pages are organized in sections, i.e. a main page and
its dependent child pages are arranged in a hierarchy. If the user stays at one
level, i.e. searches only on main pages, he is just browsing. If he moves to
deeper levels of pages, he has decided to look for more detailed knowledge. We
can easily find the session time, the timestamp, the method (POST or GET) and
the specific web page name. From the specific page name we can determine the
user’s level, that is, trace the action event. We can easily categorize the
user’s navigation path over visited pages: if a specific user searches for a
product then the event is a product event; if he requests more detail we
analyse the second level of navigation, then the next, and so on. In web log
analysis we can find all action events of the user.
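The mapping from visited pages to action events can be sketched as follows. The URL scheme (first path segment names the section, deeper segments mean more detail) and the labels "browse" and "detail" are assumptions chosen for illustration; the paper does not define them.

```python
def to_event(method, url):
    """Turn one request into an action event with its hierarchy level."""
    parts = [p for p in url.split("?", 1)[0].split("/") if p]
    level = len(parts)  # 0 = main page, 1 = section, 2+ = deeper pages
    return {
        "event": parts[0] if parts else "home",
        "level": level,
        "kind": "browse" if level <= 1 else "detail",
        "method": method,
    }


def navigation_path(requests):
    """Order (timestamp, method, url) requests and map them to events."""
    return [to_event(method, url) for _, method, url in sorted(requests)]


# Timestamps are plain integers here to keep the example short.
requests = [
    (3, "GET", "/product/42"),
    (1, "GET", "/"),
    (2, "GET", "/product"),
    (4, "POST", "/cart"),
]
path = navigation_path(requests)
events = [step["event"] for step in path]
levels = [step["level"] for step in path]
```

The resulting path shows the user moving from the main page to the product section, then one level deeper to a product detail page, and finally to the cart.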
Classify Behavioural patterns
Behavioural patterns are the concept the system
uses to classify user details. A specific user can fall into two or more
categories because the user has searched for multiple events or is interested
in multiple events. In the system we generate main categories for a user, such
as login or different product names. If a user has continuously searched for
the details of a specific product, then we add its name to the interested
product list. We group the data by user session, count how many times the user
visited each item in each session, and then categorise the user session. Using
the website’s different pages or the categorization of products we can find
behavioural patterns. If a user searches for a specific product or category
multiple times then we can identify the user’s behaviour.
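A minimal sketch of this classification step is given below. The threshold MIN_VISITS and the function name interested_products are hypothetical; the paper does not say how many visits count as "continuously searched".

```python
from collections import Counter, defaultdict

# Assumption: this many visits marks a product as interesting to the user.
MIN_VISITS = 3


def interested_products(session_events, min_visits=MIN_VISITS):
    """Return, per user, the products visited at least min_visits times.

    session_events holds one (ip, [event names]) pair per session, so a
    user's visits are accumulated across all of that user's sessions.
    """
    counts = defaultdict(Counter)
    for ip, events in session_events:
        counts[ip].update(events)
    return {
        ip: sorted(name for name, n in counter.items() if n >= min_visits)
        for ip, counter in counts.items()
    }


session_events = [
    ("10.0.0.1", ["login", "laptop", "laptop"]),
    ("10.0.0.1", ["laptop", "phone", "phone", "phone"]),
    ("10.0.0.2", ["login", "phone"]),
]
interests = interested_products(session_events)
```

The first user ends up in two categories (laptop and phone), which matches the remark that one user can belong to several categories; the second user has no product above the threshold.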
To find users’ interests and behavior we can use log files.
The log file has all entries about the products and pages users visited and
their session times. Log file entries are complex and contain different fields
such as IP address, timestamp, browser, website link, page name, status code
and method name. To find out user behavior, we process all the data to obtain,
session by session, the entries of users visiting particular page links. We can
use the complex web log file for analysis of sessions, user identification,
visited pages, visited products and bugs or errors. To understand users’
interests and behavior on an e-commerce site we can implement LTL-based model
checking techniques to analyse e-commerce web logs. We analyse the web log and
convert it into event logs to obtain the consumer history.
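To illustrate how a temporal logic query can be checked against an event log, the sketch below evaluates LTL formulas directly on finite session traces. It is a small direct evaluator written for illustration, not a full model checker and not the tool used by the proposed system; the tuple encoding of formulas and the event names (add_to_cart, purchase) are assumptions.

```python
def holds(formula, trace, i=0):
    """Evaluate an LTL formula on a finite trace, starting at position i.

    Formulas are nested tuples: ("atom", name), ("not", f), ("and", f, g),
    ("or", f, g), ("next", f), ("eventually", f), ("always", f),
    ("until", f, g). "next" is strong: it fails at the last position.
    """
    op = formula[0]
    if op == "atom":
        return i < len(trace) and trace[i] == formula[1]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "or":
        return holds(formula[1], trace, i) or holds(formula[2], trace, i)
    if op == "next":
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":
        return any(
            holds(formula[2], trace, j)
            and all(holds(formula[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op}")


cart = ("atom", "add_to_cart")
purchase = ("atom", "purchase")

# "The user adds a product to the cart and later purchases."
buys_after_cart = ("eventually", ("and", cart, ("eventually", purchase)))
# "The user adds a product to the cart but never purchases."
abandons_cart = ("and", ("eventually", cart), ("not", ("eventually", purchase)))

session_a = ["login", "search", "product", "add_to_cart", "purchase"]
session_b = ["login", "product", "add_to_cart", "logout"]

result_a = holds(buys_after_cart, session_a)
result_b = holds(abandons_cart, session_b)
```

Each session is one trace of action events, so a predefined query is simply a formula evaluated over every session; sessions for which it holds exhibit the behavioural pattern.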