<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>lee's Blog</title>
    <link>http://activerain.com/blogs/sky123</link>
    <description></description>
    <language>en-us</language>
    <item>
      <guid>http://activerain.com/blogsview/850701/how-can-i-extract-data-from-html-page-</guid>
      <title>How can I extract data from an HTML page?</title>
      <description>&lt;p&gt;Hi, &lt;br /&gt;I would like to know whether this software can extract data such as an address, fax number, or phone number from a web page. For example, I would like to extract the Address:, Phone:, Fax:, Email:, and Web: fields and send them to Excel.&lt;br /&gt;I need one company per row in Excel, with all of its details. Can this software do that, or do you know of a tool that can?&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Wed, 24 Dec 2008 00:56:10 -0600</pubDate>
      <link>http://activerain.com/blogsview/850701/how-can-i-extract-data-from-html-page-</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/844045/web-data-mining-tools</guid>
      <title>Web Data Mining Tools</title>
      <description>&lt;p&gt;Web mining(&lt;a href=&quot;http://www.knowlesys.com/web/net_screen_scraping.htm&quot;&gt;Net Screen Scraping&lt;/a&gt;) aims to discover useful information or knowledge from the Web hyperlink structure, page content, and usage data. Although Web mining uses many data mining techniques(e.g.&lt;a href=&quot;http://www.knowlesys.com/web/offline_browser.htm&quot;&gt;Offline Browser&lt;/a&gt;), as mentioned above it is not purely an application of traditional data mining due to the heterogeneity and semi-structured or unstructured nature of the Web data. Many new mining tasks and algorithms were invented in the past decade. Based on the primary kinds of data used in the mining process, &lt;a href=&quot;http://www.knowlesys.com/web/parse_text_data.htm&quot;&gt;Parse Text Data&lt;/a&gt; tasks can be categorized into three types: Web structure mining, Web content mining and Web usage mining.-- from the book &quot;Web Data Mining&quot; or &lt;a href=&quot;http://www.knowlesys.com/web/parse_a_web_page.htm&quot;&gt;Parse A Web Page&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Knowlesys can provide custom web data mining tools based on your target website and requirements.&lt;/p&gt;
&lt;p&gt;Our software is designed for data extraction from both static and dynamic web pages. It can extract any data (e.g. &lt;a href=&quot;http://www.knowlesys.com/web/open_source_screen_scraper.htm&quot;&gt;Open Source Screen Scraper&lt;/a&gt;) from the targeted web pages on the Internet. It is flexible enough to work with any web technology (e.g. html, asp, jsp, php, cfm, aspx, jscript, etc.), &lt;a href=&quot;http://www.knowlesys.com/web/pdf_data_extraction.htm&quot;&gt;Pdf Data Extraction&lt;/a&gt;.&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Fri, 19 Dec 2008 03:18:10 -0600</pubDate>
      <link>http://activerain.com/blogsview/844045/web-data-mining-tools</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/844028/get-data-from-website</guid>
      <title>Get Data From Website</title>
      <description>&lt;p&gt;Web2DB Data Service (e.g. &lt;a href=&quot;http://www.knowlesys.com/web/screen_scraper.htm&quot;&gt;Screen Scraper&lt;/a&gt;) is the most convenient way to extract data from web pages in a short time. STOP wasting your time on manual COPY / PASTE work. We can deliver your desired data quickly, in just the format you want, since we run web data extraction jobs on our Blue Whale Web Data Extraction System (&lt;a href=&quot;http://www.knowlesys.com/web/screen_scraper_com.htm&quot;&gt;Screen Scraper Com&lt;/a&gt;) every day. &lt;br /&gt;What do we do? - We extract content from web pages on your targeted website and convert the raw data to structured records in a relational database, &lt;a href=&quot;http://www.knowlesys.com/web/screen_scraper_holidays.htm&quot;&gt;Screen Scraper Holidays&lt;/a&gt;. We guarantee the quality of the results via our standard service process guide.&lt;/p&gt;
&lt;p&gt;What do you get? &amp;ndash; Accurate, fast results in the format you want. The output can be Excel, Access, CSV, text, MS SQL, MySQL, etc. &lt;a href=&quot;http://www.knowlesys.com/web/screen_scraper_program.htm&quot;&gt;Screen Scraper Program&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;How does it work? - Our software is designed for data extraction from both static and dynamic web pages. We use it to analyze the website, extract the data, and process the data. You will receive a progress message every day while we work on your project, until you receive a preview of the final data. e.g. &lt;a href=&quot;http://www.knowlesys.com/web/screen_scraper_software.htm&quot;&gt;Screen Scraper Software&lt;/a&gt;.&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Fri, 19 Dec 2008 03:01:10 -0600</pubDate>
      <link>http://activerain.com/blogsview/844028/get-data-from-website</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/844026/data-mining-for-web-intelligence</guid>
      <title>Data Mining For Web Intelligence</title>
      <description>&lt;div class=&quot;post-body&quot;&gt;
&lt;p&gt;The Web constitutes a highly dynamic information source. Not only does the Web continue to grow rapidly, the information it holds also receives constant updates. News, stock market, service center, &lt;a href=&quot;http://www.knowlesys.com/web/screen_scraping_code.htm&quot;&gt;Screen Scraping Code&lt;/a&gt;&amp;nbsp;and corporate sites revise their Web pages regularly. Linkage information and access records also undergo frequent updates.&lt;/p&gt;
&lt;p&gt;Industries&lt;/p&gt;
&lt;p&gt;We provide services or custom software(e.g.&lt;a href=&quot;http://www.knowlesys.com/web/screen_scraper_tool.htm&quot;&gt;Screen Scraper Tool&lt;/a&gt;) to clients across all industries. Some of the industries that we have provided services and custom software (&lt;a href=&quot;http://www.knowlesys.com/web/screen_scraping_data.htm&quot;&gt;Screen Scraping Data&lt;/a&gt;)for are:&lt;/p&gt;
&lt;p&gt;Consultant Marketing/Research &lt;br /&gt;Healthcare Retail &lt;br /&gt;Defense Manufacturing&lt;br /&gt;Software Travel &lt;br /&gt;Energy Real Estate&lt;br /&gt;Financial Aerospace&lt;/p&gt;
&lt;p&gt;Summary&lt;/p&gt;
&lt;p&gt;KnowleSys Web2DB Service (e.g. &lt;a href=&quot;http://www.knowlesys.com/web/screen_scraping.htm&quot;&gt;Screen Scraping&lt;/a&gt;) is a low-cost way to extract critical business data from web sites, such as contact information (company name, phone numbers, e-mail addresses, address, and hyperlinks) and product information (product number, product name, price, stock, description, picture). Their service is a cool tool for your business. &lt;a href=&quot;http://www.knowlesys.com/web/screen_scraping_api.htm&quot;&gt;Screen Scraping Api&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;/div&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Fri, 19 Dec 2008 03:00:28 -0600</pubDate>
      <link>http://activerain.com/blogsview/844026/data-mining-for-web-intelligence</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/844019/the-top-five-tips-for-crm-strategy</guid>
      <title>The top five tips for CRM strategy</title>
      <description>&lt;p&gt;&amp;nbsp;The familiar refrain of CRM failure is a hard one to avoid these days with so many industry watchers pointing to flawed strategies among customers, vendors and consultants as the reasons for an overwhelming lack of success.&lt;br /&gt;&lt;br /&gt;Researchers such as Gartner Group and Meta Group have chronicled failure rates of 55-70% for CRM implementations in general, and point to a lack of clear strategy as a key contributor to this dismal industry track record. So, it would seem only rational to turn to these same industry watchers for their answers to the obvious questions that arise out of this: which CRM strategies work, and why?&lt;br /&gt;&lt;br /&gt;To find out just what CRM strategies are paying off and why these tactics are the cornerstones to success, SearchCRM tracked down five industry experts for their best advice to the masses. What follows is the first in a three-part series on their top tips for effective CRM strategies.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Tip 1: Tackle business issues before choosing a technology.&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Almost all of the experts agreed that most CRM buyers get confused into thinking that technology is the answer to their problems when in most cases, core businesses processes need to be rethought to better serve customers.&lt;br /&gt;&lt;br /&gt;&quot;CRM is not about technology. It is about the interplay of strategy, tactics, processes, skills and the technology that supports these areas. In fact, CRM can be done without technology. But it cannot scale without technology. Therefore, firms need to place the emphasis on a balance of these five areas,&quot; said Scott Nelson, analyst for Gartner Group, Stamford, Conn.&lt;br /&gt;&lt;br /&gt;Tom Topolinski, an analyst with Gartner research division Dataquest, agreed.&lt;br /&gt;&lt;br /&gt;&quot;Consider your current sales, marketing, support and service processes,&quot; he said. 
&quot;[Ask yourself] will this new system fit into them or mandate that they be changed? How much change can I afford? What is driving our current customer philosophy? Will the new system fit into this or mandate change to how we treat our clients?&quot;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Tip 2: Explain real business needs to vendors/partners before investing.&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;In speaking to analysts about CRM's disappointment, it becomes immediately clear that a distinct lack of communication between many organizations and their vendor partners has factored heavily into the climate of disenchantment.&lt;br /&gt;&lt;br /&gt;&quot;Communicate the business issues to your prospective vendors,&quot; said David Bradshaw, an analyst at London-based researcher Ovum. &quot;The best suppliers will ask smart questions and come up with smarter solutions than you can by yourself. So don't cramp their style by dictating technology, instead demand proposals that make business sense.&quot;&lt;br /&gt;&lt;br /&gt;Other analysts concurred that having the right game plan before committing to a vendor will equate to a much higher chance at success.&lt;br /&gt;&lt;br /&gt;&quot;Be proactive,&quot; suggested Denis Pombriant, analyst at Boston-based Aberdeen Group. &quot;Always perform a baseline study of your situation before inviting vendors into the process or perform the study as part of the sales process with a few select vendors. This process was once called needs analysis and was directed by the vendor. The result was identification of problems that the vendors felt qualified to solve which may or may not have been congruent with the actual business problems of the purchasing organization.&quot;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Tip 3: Keep the customer as priority no. 1.&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;The point of CRM is getting closer to your customers but many analysts feel internal organizational issues often overshadow this goal.&lt;br /&gt;&lt;br /&gt;&quot;Never lose focus on your customers,&quot; Ovum's Bradshaw advised. &quot;Make things better for them, not worse, or they'll vote with their feet. Customers are smart enough to know that notices beginning 'For your convenience...' usually mean the opposite!&quot;&lt;br /&gt;&lt;br /&gt;According to Gartner's Nelson, the customer should be the number one focus throughout developing any CRM strategy.&lt;br /&gt;&lt;br /&gt;&quot;Organize around the customer and not products or geographies,&quot; the analyst said. &quot;All the investment made in systems will not help you if you continue to be structured around products. Customers do not understand your firm's political structures. Don't make them.&quot;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Tip 4: Eat the elephant in small pieces.&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Almost all of the experts agreed that having the right schedule should go a long way toward achieving more effective CRM.&lt;br /&gt;&lt;br /&gt;&quot;Take manageable, deliberate steps towards your ultimate goal of enterprise-wide CRM,&quot; offered Ovum's Bradshaw. &quot;Make it work in one area at a time, and make sure you have demonstrable results before moving on. If you can go for the 'easy wins' first, that's great, but it isn't always an available option. For example, getting to clean, consistent customer data is never easy, but it is vital.&quot;&lt;br /&gt;&lt;br /&gt;Many of the tipsters suggested taking a &quot;wave approach&quot; that looks at the big picture but deals with various elements as they come up.&lt;/p&gt;
&lt;p&gt;&quot;CRM is iterative,&quot; said Gartner's Nelson. &quot;It often takes a while to see the results. One of the easiest traps to fall into is 'once and done.' CRM is not like that. Firms should think in terms of iterative waves, where they learn more about the customer, and improve their sales, marketing and service abilities with each iteration.&quot;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Tip 5: Keep it enterprise.&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;Another common theme among the analysts is the war cry to keep top-level executives involved in the CRM process from start to finish, and to emphasize that the enterprise approach cannot fall prey to issues of departmental independence.&lt;br /&gt;&lt;br /&gt;&quot;Too many executives don't know what CRM is, and what it really means to their organization, before they start writing big checks,&quot; said Allen Bonde, president of the Boston-based Allen Bonde Group. &quot;Once they do [understand CRM], then you need to build consensus among business and IT managers on how to achieve it.&quot;&lt;br /&gt;&lt;br /&gt;Gartner's Nelson said he has seen the departmental approach cause failure in CRM projects more than once.&lt;br /&gt;&lt;br /&gt;&quot;CRM done at a department level sub-optimizes the customer relationship,&quot; he said. &quot;This is why CRM needs to be strategized at an enterprise level, even if a particular initiative is departmental in nature. Think strategically, invest tactically, and make sure everything fits into an enterprise wide strategy.&quot;&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Fri, 19 Dec 2008 02:55:49 -0600</pubDate>
      <link>http://activerain.com/blogsview/844019/the-top-five-tips-for-crm-strategy</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/843983/seo-secret-that-everybody-ignores</guid>
      <title>SEO Secret That Everybody Ignores</title>
      <description>&lt;p&gt;Search engine optimization is a very critical task in the success of any website. In recent times search engines seem to have complicated this task further by their frequent changes in rules. This has caused lots of anxiety as some of these changes have seen some sites lose virtually all their regular traffic instantly as their rankings have tumbled.&lt;/p&gt;
&lt;p&gt;Are You Ignoring An Important SEO Step?&lt;br /&gt;This has further added to the confusion amongst webmasters about search engines and their motivations. But no matter how mad one gets at the search engines, there is little that they can do to change the statistics which clearly indicate that well over 75 per cent of the traffic that most sites receive comes directly from search engines.&lt;/p&gt;
&lt;p&gt;However, there is a secret that an increasing number of webmasters have discovered and are putting to good use. Whatever regular changes search engines instigate, their motivation remains the same. Most webmasters forget that there is currently very stiff competition between the leading search engines. More so because it has become abundantly clear that none of the top search engines are interested in the runners up position.&lt;/p&gt;
&lt;p&gt;The search engine motivation&lt;br /&gt;This stiff competition between search engines is focused on the customer, that is the person who uses search engines to find information online. The preferred search engine and therefore the top one will always be the one that most satisfies the needs of that customer.&lt;/p&gt;
&lt;p&gt;So whatever changes search engines make they will always be focused on improving the search engine experience for surfers. It is not too difficult to figure out what those who use search engines want, or even more important, what they do not want. Anybody using a search engine wants to be able to find what they are looking for quickly. Most will be looking for the most detailed quality content on the subject or information that they seek.&lt;/p&gt;
&lt;p&gt;This simply means that any website that places its focus on the end consumer using search engines, rather than on the search engines themselves, is guaranteed to retain its high ranking whatever changes search engines keep making.&lt;/p&gt;
&lt;p&gt;Can you dare assume that search engines do not exist?&lt;br /&gt;So the most effective approach is for website owners to direct their focus on surfers, just as the search engines do. It would help tremendously for them to start operating as if search engines did not exist and fully concentrate on ensuring that their sites have quality detailed content.&lt;/p&gt;
&lt;p&gt;The dangers and harmful effects of webmasters focusing on search engines are very evident. We have sites that use keywords so heavily that it affects the quality of writing on their sites. Some sites have even done worse, leaving their quality content intact but creating keywords that are hidden to the human eye but visible to search engines.&lt;/p&gt;
&lt;p&gt;In fact there are a whole lot of schemes on the net created to fool search engines, which have no regard for the surfer or what information they are looking for. For instance there is plenty of software on sale online whose promoters bluntly state is designed to fool search engines.&lt;/p&gt;
&lt;p&gt;These are the sort of schemes that are causing so many constant changes in the rules of leading search engines as they look for ways to combat any tricks that would favor undeserving sites in their rankings.&lt;/p&gt;
&lt;p&gt;Content will always be king&lt;br /&gt;What all this means is that content remains the most important search engine optimization tool. Just take a closer look at all the leading Google websites and try to trace a common thread running through them all. You will quickly discover that the quality of writing and content in these sites is extremely high.&lt;/p&gt;
&lt;p&gt;This means that any webmaster that makes an effort to provide quality, relevant and detailed content on their site with the experience of the surfer as their central focus will be using an extremely powerful search engine optimization tool.&lt;/p&gt;
&lt;p&gt;If your content is good, then you can post it at other websites complete with a detailed resource box that directs traffic to your website. Quality articles will usually end up being re-posted all over the net in an endless viral effect that will create quality links back to your site. Search engines still rank websites based on the links from other sites leading to it so quality content has a double advantage.&lt;/p&gt;
&lt;p&gt;You will of course need to do a thorough job of posting your articles to various article directories, ezine publishers and announcement lists.&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Fri, 19 Dec 2008 01:11:34 -0600</pubDate>
      <link>http://activerain.com/blogsview/843983/seo-secret-that-everybody-ignores</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/843937/extract-data-from-web-site</guid>
      <title>Extract Data From Web Site</title>
      <description>&lt;div class=&quot;post-body&quot;&gt;
&lt;p&gt;Here is an example:&lt;br /&gt;&lt;br /&gt;&quot;Hi,&lt;br /&gt;&lt;br /&gt;I'd like to collect all data of pharmacies from a Hungarian website www(dot)dr(dot)info(dot)hu into excel sheet (please replace (dot) with .). You can find the list under the headword &quot;gyogyszertarak&quot; (third button on the center of the site):&lt;br /&gt;&lt;br /&gt;1. Name&lt;br /&gt;2. Address&lt;br /&gt;3. City&lt;br /&gt;4. Phone&lt;br /&gt;5. Type&lt;br /&gt;&lt;br /&gt;You can see the site in action if you choose 'Budapest' dropdown and click on the 'KERESES' button. All pharmacies from Budapest appears. I need a complete list of all pharmacies in all cities (all database) in Excel sheet.&lt;br /&gt;&lt;br /&gt;The data should be collected(&lt;a href=&quot;http://www.knowlesys.com/web/web_screen_scraper.htm&quot;&gt;Web Screen Scraper&lt;/a&gt;) in a manner that I can sort pharmacies by each field (eg. City, name&amp;hellip;). &quot;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Web2DB&amp;nbsp; (&lt;a href=&quot;http://www.knowlesys.com/web/web_screen_scraping.htm&quot;&gt;Web Screen Scraping&lt;/a&gt;) service could provide you custom-designed Web Data Extraction Software in a very short time. Contact us now!&lt;/p&gt;
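&lt;p&gt;The request quoted above boils down to one row per pharmacy, with columns that can be sorted by any field. A minimal Python sketch of that output step might look like the following. It assumes the records have already been collected; the to_csv helper, the FIELDS list, and the sample rows are hypothetical placeholders, not real data from the site.&lt;/p&gt;
&lt;pre&gt;
```python
import csv
import io

# Column order taken from the five fields listed in the request.
FIELDS = ["Name", "Address", "City", "Phone", "Type"]


def to_csv(records, sort_key="City"):
    """Return CSV text with one record per row, sorted by the given field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in sorted(records, key=lambda r: r[sort_key]):
        writer.writerow(record)
    return buffer.getvalue()


# Placeholder rows for illustration only, not real pharmacy data.
sample = [
    {"Name": "Pharmacy B", "Address": "2 Example St", "City": "Szeged",
     "Phone": "000-0002", "Type": "public"},
    {"Name": "Pharmacy A", "Address": "1 Example St", "City": "Budapest",
     "Phone": "000-0001", "Type": "public"},
]

print(to_csv(sample))
```
&lt;/pre&gt;
&lt;p&gt;A CSV file opens directly in Excel, where the header row makes it possible to sort by City, Name, or any other column.&lt;/p&gt;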
&lt;p&gt;&lt;strong&gt;Screenshots of Examples&lt;/strong&gt;:&lt;br /&gt;&amp;nbsp;    &lt;a href=&quot;http://www.knowlesys.com/web/web_search_and_data_mining.htm&quot;&gt;Web Search And Data Mining&lt;/a&gt;&lt;br /&gt;&amp;nbsp;    &lt;a href=&quot;http://www.knowlesys.com/web/web_search_data_mining.htm&quot;&gt;Web Search Data Mining&lt;/a&gt;&lt;br /&gt;&amp;nbsp;    &lt;a href=&quot;http://www.knowlesys.com/web/web_service_based_data_mining.htm&quot;&gt;Web Service Based Data Mining&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Thu, 18 Dec 2008 23:55:19 -0600</pubDate>
      <link>http://activerain.com/blogsview/843937/extract-data-from-web-site</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/842157/deep-web-data-extraction</guid>
      <title>Deep Web Data Extraction</title>
      <description>&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt; &lt;br /&gt; The unabated growth of the Web has resulted in a situation in which                more information is available to more people than ever in human                history. Along with this unprecedented growth has come the inevitable                problem of information overload. To counteract this information                overload, users typically rely on search engines (like Google and                AllTheWeb) or on manually-created categorization hierarchies (like                Yahoo! and the Open Directory Project). Though excellent for accessing                Web pages on the so-called &quot;crawlable&quot; web, these approaches                overlook a much more massive and high-quality resource: the Deep                Web.&lt;/p&gt;
&lt;p&gt;The Deep Web (or Hidden Web) comprises all information that resides                in autonomous databases behind portals and information providers'                web front-ends. Web pages in the Deep Web are dynamically-generated                in response to a query through a web site's search form and often                contain rich content. A recent study has estimated the size of the                Deep Web to be more than 500 billion pages, whereas the size of                the &quot;crawlable&quot; web is only 1% of the Deep Web (i.e.,                less than 5 billion pages). Even those web sites with some static                links that are &quot;crawlable&quot; by a search engine often have                much more information available only through a query interface.                Unlocking this vast deep web content presents a major research challenge.&lt;/p&gt;
&lt;p&gt;In analogy to search engines over the &quot;crawlable&quot; web,                we argue that one way to unlock the Deep Web is to employ a fully                automated approach to extracting, indexing, and searching the query-related                information-rich regions from dynamic web pages. For this miniproject,                we focus on the first of these: extracting data from the Deep Web.&lt;/p&gt;
&lt;p&gt;Extracting the interesting information from a Deep Web site requires                many things: including scalable and robust methods for analyzing                dynamic web pages of a given web site, discovering and locating                the query-related information-rich content regions, and extracting                itemized objects within each region. By full automation, we mean                that the extraction algorithms should be designed independently                of the presentation features or specific content of the web pages,                such as the specific ways in which the query-related information                is laid out or the specific locations where the navigational links                and advertisement information are placed in the web pages.&lt;/p&gt;
&lt;p&gt;There are many possible 7001-miniprojects. Feel free to talk to                either of us for more details. Here are a few possibilities to consider:&lt;/p&gt;
&lt;p&gt;1. Develop a Web-based demo for clustering pages of a similar                type from a single Deep Web source. For example, AllMusic produces                three types of pages in response to a user query: a direct match                page (e.g. for Elvis Presley), a list of links to match pages (e.g.                a list of all artists named Jackson), and a page with no matches.                As a first-step to extracting the relevant data from each page,                you may develop techniques to separate out the pages that contain                query matches from pages that contain no matches, and perhaps, rank                each group based on some metric of quality.&lt;/p&gt;
&lt;p&gt;2. Design a system for extracting interesting data from a collection of pages from a Deep Web source. You might define a set of regular expressions that can identify dates, prices, or names. Develop a small program that converts a page into a type structure. For example, given a DOM model of a web page, identify all of the types that you have defined, and replace the string tokens with XML tags identifying the types. Replace all non-type tokens with a generic type, and return the tree as a full type structure. Alternatively, you may suggest your own approach for extracting data.&lt;/p&gt;
&lt;p&gt;3. Develop a system to recognize names in a page. Given a list of names and a web page, identify possible matches in the page. Based on the structure of the page and the distribution of recognized names, identify strings that may also be names based on their location in the DOM tree hierarchy representing the page.&lt;/p&gt;
&lt;p&gt;4. Write a survey paper about current approaches for understanding                and analyzing the Deep Web. Be sure to include many of your own                comments on the viability of the approaches you review.&lt;/p&gt;
&lt;p&gt;5. Or, feel free to suggest a miniproject of your own.&lt;/p&gt;
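&lt;p&gt;As a rough illustration of option 2 above, here is a minimal Python sketch of typing tokens with regular expressions. It is a sketch under stated assumptions, not a reference solution: the two patterns, the names TYPE_PATTERNS, classify and type_structure, and the sample sentence are all hypothetical. It also simplifies the task by working on whitespace-separated text rather than a DOM tree, and by returning (type, token) pairs instead of XML tags.&lt;/p&gt;
&lt;pre&gt;
```python
import re

# Hypothetical type patterns; real pages would need much broader coverage.
TYPE_PATTERNS = [
    ("DATE", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("PRICE", re.compile(r"\$\d+(?:\.\d{2})?")),
]


def classify(token):
    """Return the first type whose pattern matches the whole token, else TEXT."""
    for type_name, pattern in TYPE_PATTERNS:
        if pattern.fullmatch(token):
            return type_name
    return "TEXT"


def type_structure(text):
    """Map each whitespace-separated token to a (type, token) pair."""
    return [(classify(token), token) for token in text.split()]


print(type_structure("Released 1956-03-23 for $3.98"))
```
&lt;/pre&gt;
&lt;p&gt;Tokens that match no pattern fall back to the generic TEXT type, which mirrors the generic type suggested for non-type tokens.&lt;/p&gt;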
&lt;p&gt;&lt;strong&gt;Background:&lt;/strong&gt; Knowledge of Java or Python would be helpful. Some knowledge of information retrieval and machine learning may be useful but is not required.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Deliverables:&lt;/strong&gt; You should submit a report that clearly describes what you have learned and what you have accomplished. The report should include useful references. You should also provide any source code you may have written to validate your ideas.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Evaluation:&lt;/strong&gt; You will be graded on the novelty and quality of your report and implementation.&lt;/p&gt;
&lt;p&gt;......&lt;/p&gt;
&lt;p&gt;more information:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/collect_business_directory_data.htm&quot;&gt;Collect Business Directory Data&lt;/a&gt;&lt;br /&gt; &lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/collect_product_information.htm&quot;&gt;Collect Product Information&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/collect_website_data.htm&quot;&gt;Collect Website Data&lt;/a&gt;&lt;br /&gt; &lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/custom_web_crawler.htm&quot;&gt;Custom Web Crawler&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/web_data_crawler.htm&quot;&gt;Web Data Crawler&lt;/a&gt;&lt;br /&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/sex_offender_database.htm&quot;&gt;Sex Offender Database&lt;/a&gt;&lt;br /&gt; &lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/web_screen_scraping.htm&quot;&gt;Web Screen Scraping&lt;/a&gt;&lt;br /&gt; &lt;a href=&quot;http://www.knowlesys.com/articles/data-extraction/marketing_data.htm&quot;&gt;Collect Marketing Data&lt;/a&gt;&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Thu, 18 Dec 2008 04:39:15 -0600</pubDate>
      <link>http://activerain.com/blogsview/842157/deep-web-data-extraction</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/838319/marketing-data</guid>
      <title>Marketing Data</title>
      <description>&lt;p&gt;Looking for the best way to finish web screen scraping within twenty-four hours? Web scraping, or web harvesting, is technically any method of extracting content from a website over HTTP. The content is almost always converted into another format for use in another context, such as marketing. In this brief article, we&amp;rsquo;ll take a look at how you can scrape web data most efficiently, as well as the legal issues and technical obstacles that web scrapers may run into.&lt;/p&gt;
&lt;p&gt;The most common form of web screen scraping is the web crawler, used by sites such as Google. One commonly seen use of web scraping is the scraper site, a website on which none of the content is original and all information is taken from existing websites. The best way to scrape data is with one of the many available programs, which range from personal to corporate grade. Personal scraping programs can be free or cheap, while corporate-grade scrapers can cost thousands of dollars. A scraper works by going over a website and collecting the relevant data from any number of fields, whether plain text, e-mail addresses, or phone and fax numbers.&lt;/p&gt;
&lt;p&gt;Common legal issues with web screen scraping are invasion of privacy and violation of terms of use. Certain publication licenses, such as Creative Commons, allow reproduction of material, and a recent lawsuit ruled that reproducing facts was not a legal violation, but the web scraper must still be careful about what he or she chooses to reproduce. Gathering personal information such as phone numbers, fax numbers, and e-mail addresses can be an invasion of privacy if the user is not informed or if the information is improperly used. Some sort of agreement should therefore be obtained from the user at the time of collection; otherwise the user may, in some cases, take serious legal action.&lt;/p&gt;
&lt;p&gt;Websites have several ways to block web screen scraping, and anyone who wants to scrape should be aware of them. Some sites block scrapers&amp;rsquo; IP addresses, and some list restrictions in robots.txt. Some sites block bots based on what they declare themselves to be (though poorly behaved crawlers might identify themselves as ordinary users). Traffic monitoring and verification programs can also block crawlers. Being aware of these obstacles, and having a legitimate way to overcome them, is very helpful to anyone trying to scrape information.&lt;/p&gt;
&lt;p&gt;For more information please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt; .&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Tue, 16 Dec 2008 00:45:08 -0600</pubDate>
      <link>http://activerain.com/blogsview/838319/marketing-data</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/838311/web-screen-scraping</guid>
      <title>Web Screen Scraping</title>
      <description>&lt;p&gt;Looking for the best way to finish web screen scraping within twenty-four hours? Web scraping, or web harvesting, is technically any method of extracting content from a website over HTTP. The content is almost always converted into another format for use in another context, such as marketing. In this brief article, we&amp;rsquo;ll take a look at how you can scrape web data most efficiently, as well as the legal issues and technical obstacles that web scrapers may run into.&lt;/p&gt;
&lt;p&gt;The most common form of web screen scraping is the web crawler, used by sites such as Google. One commonly seen use of web scraping is the scraper site, a website on which none of the content is original and all information is taken from existing websites. The best way to scrape data is with one of the many available programs, which range from personal to corporate grade. Personal scraping programs can be free or cheap, while corporate-grade scrapers can cost thousands of dollars. A scraper works by going over a website and collecting the relevant data from any number of fields, whether plain text, e-mail addresses, or phone and fax numbers.&lt;/p&gt;
&lt;p&gt;Common legal issues with web screen scraping are invasion of privacy and violation of terms of use. Certain publication licenses, such as Creative Commons, allow reproduction of material, and a recent lawsuit ruled that reproducing facts was not a legal violation, but the web scraper must still be careful about what he or she chooses to reproduce. Gathering personal information such as phone numbers, fax numbers, and e-mail addresses can be an invasion of privacy if the user is not informed or if the information is improperly used. Some sort of agreement should therefore be obtained from the user at the time of collection; otherwise the user may, in some cases, take serious legal action.&lt;/p&gt;
&lt;p&gt;Websites have several ways to block web screen scraping, and anyone who wants to scrape should be aware of them. Some sites block scrapers&amp;rsquo; IP addresses, and some list restrictions in robots.txt. Some sites block bots based on what they declare themselves to be (though poorly behaved crawlers might identify themselves as ordinary users). Traffic monitoring and verification programs can also block crawlers. Being aware of these obstacles, and having a legitimate way to overcome them, is very helpful to anyone trying to scrape information.&lt;/p&gt;
&lt;p&gt;For more information please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt; .&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Tue, 16 Dec 2008 00:33:43 -0600</pubDate>
      <link>http://activerain.com/blogsview/838311/web-screen-scraping</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/838297/sex-offender-database</guid>
      <title>Sex Offender Database</title>
      <description>&lt;p&gt;Security is one of the greatest concerns of anyone who has a family, runs a business, or rents space to other people. One of the best ways to increase security is through a visitor management system. These systems can provide a range of security measures, from keeping out people without access to triggering alarm systems. In this brief article, we&amp;rsquo;ll go over how including a connection to a sex offender database can improve the security of your visitor management system, as well as some useful tips when considering how to use your system to keep yourself or others safe and secure.&lt;/p&gt;
&lt;p&gt;Visitor management systems have the obvious benefit of keeping out anyone who doesn&amp;rsquo;t meet the requirements you specify to enter. They have the added bonus, however, of requiring those you allow in to provide information about themselves. For this reason, access to a sex offender database lets you, in the case of something like a housing development, not necessarily exclude anyone but be aware of the status of anyone who wants to gain entry. All you need to connect your visitor management system to a sex offender database is a wireless or wired internet connection &amp;ndash; after that, it&amp;rsquo;s a matter of having someone install a self-updating program to keep your system in sync.&lt;/p&gt;
&lt;p&gt;An easy way to keep your visitor management system current with the online sex offender database is through a web crawler or spider. These programs can be set to scrape certain sites for information, be it visual (maps and charts) or textual (bulk data and indices). Having a spider check the offender database, say, once a week should be plenty. Any professional in the field could set up a simple auto-update program that keeps your visitor management system effective and up to date with little to no effort on your part.&lt;/p&gt;
&lt;p&gt;Some things to keep in mind while considering the sex offender database as part of your visitor management program are your proximity to schools and parks, which restrict the activities of registered sex offenders within a certain distance, and of course the action you might take upon receiving information that someone who wants to gain entry is a registered offender. There might even be some merit to having the system automatically dial emergency numbers if your system is close enough to a park or school for the offender to be in violation of that distance.&lt;/p&gt;
&lt;p&gt;For more information please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt; .&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Tue, 16 Dec 2008 00:25:31 -0600</pubDate>
      <link>http://activerain.com/blogsview/838297/sex-offender-database</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/838290/web-data-crawler</guid>
      <title>Web Data Crawler</title>
      <description>&lt;p&gt;Building your own web data crawler is a great way to get very specific information in whatever fields you choose, but it can be trickier than most people think. In this brief article, we&amp;rsquo;ll go over some easy tips and tricks to keep in mind while constructing a spider, but first we&amp;rsquo;ll take a look at some basic information on crawlers. A web crawler is, essentially, any package of code that is designed to browse the web in a specific pattern. Crawlers can be used for data collection, website maintenance (through checking links and looking at images), search engine indexing, and much more. They are the most common type of web scraping tool.&lt;/p&gt;
&lt;p&gt;The basic web data crawler is a very simple bundle of code that is designed to jump from link to link, copying text or other data that meets certain parameters along the way. Depending on what you intend to use your crawler for, you&amp;rsquo;ll need to adjust how it behaves. For example, say you are building a spider to collect data on a certain demographic, in this case online auction traders. You would probably want to include sites like eBay in its path, and set it to gather information on which goods are most commonly auctioned, the pricing for different types of goods, and so on. Conversely, a spider sent to test links on a personal website and check for errors in code will act completely differently. Keep the purpose of your spider in mind as you build it.&lt;/p&gt;
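&lt;p&gt;The link-to-link behavior described above can be sketched as a breadth-first walk. This is only an illustration under stated assumptions: the link graph LINKS and the function crawl_order are hypothetical, and the pages live in an in-memory dictionary instead of being downloaded, so no real site is contacted.&lt;/p&gt;

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
# A real crawler would download each page and parse its links instead.
LINKS = {
    "/": ["/auctions", "/about"],
    "/auctions": ["/auctions/item-1", "/"],
    "/about": [],
    "/auctions/item-1": ["/auctions"],
}


def crawl_order(start, links=LINKS):
    """Return pages in breadth-first order, visiting each page only once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            # Skip pages already queued so that link cycles cannot loop forever.
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


if __name__ == "__main__":
    for page in crawl_order("/"):
        print(page)
```

&lt;p&gt;The set of pages already seen is what keeps the spider from circling between pages that link to each other.&lt;/p&gt;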
&lt;p&gt;Remember, a custom web data crawler can behave well or poorly, depending on how you code it. A well-behaved spider will obey the instructions in files like robots.txt, which tell automated crawlers which parts of a site they may visit. A well-behaved spider will also announce itself: what it is, and for whom it is crawling. The benefits of having a well-behaved crawler are fairly obvious &amp;ndash; you won&amp;rsquo;t receive complaints from webmasters who catch you crawling where you aren&amp;rsquo;t supposed to, and you avoid the serious lawsuits that can result from coding a spider that ignores attempts to keep it out.&lt;/p&gt;
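&lt;p&gt;As a rough sketch of what obeying robots.txt can look like, the example below uses RobotFileParser from the Python standard library. The robots.txt content, the crawler name ExampleSpider, and the helper can_fetch are assumptions made for illustration; a real spider would download the file from the site it is about to crawl.&lt;/p&gt;

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, kept in memory for the example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical crawler name; a well-behaved spider would also send it
# in the User-Agent header of every request.
USER_AGENT = "ExampleSpider"


def can_fetch(url, robots_txt=ROBOTS_TXT, agent=USER_AGENT):
    """Return True if the robots.txt rules allow this agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/data.html"):
        print(url, can_fetch(url))
```

&lt;p&gt;Calling a check like this before every request is one simple way to keep a spider out of the areas a webmaster has closed off.&lt;/p&gt;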
&lt;p&gt;Having a web data crawler at your disposal can be a valuable resource, but it must be used correctly. As long as your crawler is respectful and obedient to webmasters&amp;rsquo; commands, you&amp;rsquo;ll be collecting data without a hitch in no time at all.&lt;/p&gt;
&lt;p&gt;For more information please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt; .&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Tue, 16 Dec 2008 00:16:38 -0600</pubDate>
      <link>http://activerain.com/blogsview/838290/web-data-crawler</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/836602/custom-web-crawler</guid>
      <title>Custom Web Crawler</title>
      <description>&lt;p&gt;The most popular way to collect data on the web, by far, is the web crawler. Having a custom web crawler to seek out and compile information you specify can be immensely useful to anyone who deals with large amounts of data &amp;ndash; be you an attorney, a scientist, or an advertiser. A web crawler (also known as a web spider or web robot) is, basically, any program or automatic script that scours the web in a set pattern. These code packages can be invaluable at recovering data for a variety of purposes. In this article, we&amp;rsquo;ll take a look at the most common ways web crawlers are used, how you can customize a web crawler, and some tips to keep in mind when creating yours.&lt;/p&gt;
&lt;p&gt;Web crawlers are gatherers of information, and the internet is the biggest repository of information in the world. It makes sense, therefore, that the most common browsers of the internet are not people but spiders. Spiders are used to keep search engines up to date, to discover and index new pages, to rank search results, to scrape web pages, and to maintain websites (by checking links and looking at images). Web crawlers can be of use to anyone who frequently uses the internet to gather similar information, who wants to keep up to date on a certain site, or who wants to maintain their own website. Essentially, anyone who has a large amount of data to deal with and doesn&amp;rsquo;t want to sift through it by hand can benefit from a custom web crawler.&lt;/p&gt;
&lt;p&gt;Coding a custom crawler is probably beyond most people&amp;rsquo;s programming skills, so a number of companies have cropped up that provide various methods of web data extraction. The most popular of these is the custom web crawler, which can be configured to extract certain types of data and programmed to visit certain sites or even certain kinds of sites. It works by collecting data, both static and dynamic, from websites. It then converts this data into a readable format, and can perform simple editing functions like the removal of repeated material.&lt;/p&gt;
&lt;p&gt;Important things to keep in mind when using a custom web crawler, or any form of online data collection, are the behavior of your crawler and terms of use you may violate. A well-behaved crawler will announce what it is and follow instructions in robots.txt, a file through which websites can control how crawlers behave.&lt;/p&gt;
&lt;p&gt;For more information please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt; .&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Mon, 15 Dec 2008 04:32:04 -0600</pubDate>
      <link>http://activerain.com/blogsview/836602/custom-web-crawler</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/836588/collect-product-information</guid>
      <title>Collect Product Information</title>
      <description>&lt;p&gt;The most common form of data collection on the web is the web crawler, also known as a web spider or web robot. The web crawler is, essentially, a package of code that browses through web sites in a methodical manner, often with a predetermined set of instructions from the user. Crawlers can be useful for a variety of things, from indexing sites for a search engine to gathering information for marketing. In this case, we&amp;rsquo;ll take a look at how they can be used to collect product information, and how that information can be used for your business&amp;rsquo;s benefit. &lt;a href=&quot;http://www.knowlesys.com/products/custom_web_data_extractor.htm&quot;&gt;Custom Web Data Grabber&lt;/a&gt;&lt;br /&gt; Crawlers work by hopping from site to site across available links. They usually pick up information as they go, depending on what the user has told them to do. A common function for web crawlers is picking up client data, such as e-mail addresses and phone numbers, either for lead generation or for marketing purposes. They can also be used to maintain one&amp;rsquo;s own website, by accessing and testing links and images and fixing broken ones, all automatically. &lt;a href=&quot;http://www.knowlesys.com/products/custom_web_data_grabber.htm&quot;&gt;Custom Web Data Extractor&lt;/a&gt; The user can specify which type or field of information he or she wishes the spider to collect, and what sort of web sites he or she wants it to browse. Setting a spider to collect product information is an easy way to get a leg up on the competition.&lt;/p&gt;
&lt;p&gt;If you send your spider out to collect product information, either from a competitor&amp;rsquo;s website or to set prices, it will then compile that information into a readable format for you. This allows you to easily create spreadsheets and graphs, balance prices and research, and even get information on how competitors&amp;rsquo; websites and support networks operate. A well-behaved spider will announce itself as it crawls, and certain websites might want to block your spider in a variety of ways.&lt;/p&gt;
&lt;p&gt;This can be tricky. It is difficult to ignore a website&amp;rsquo;s instructions in, say, robots.txt (a generic file used to give commands and restrictions to automated crawlers) without violating terms of use or a privacy agreement. &lt;a href=&quot;http://www.knowlesys.com/products/custom_web_data_spider.htm&quot;&gt;Custom Web Data Spider&lt;/a&gt; Therefore, it is best to have permission, in the form of an agreement or license, before crawling a site that doesn&amp;rsquo;t necessarily want you there. Having a spider to collect product information can be immensely useful, but one must always remember that one is responsible for the spider, even though it is an automated, independent entity.&lt;/p&gt;
&lt;p&gt;For more information please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt; .&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Mon, 15 Dec 2008 02:48:03 -0600</pubDate>
      <link>http://activerain.com/blogsview/836588/collect-product-information</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/836586/collect-website-data</guid>
      <title>Collect Website Data</title>
      <description>&lt;p&gt;Whether it&amp;rsquo;s for personal use or internet marketing, anyone who needs to collect website data wants to do it quickly and efficiently. There are several ways to do this, but some come more highly recommended than others. In this brief article, we&amp;rsquo;ll take a look at some of the ways to extract data from websites, as well as what that data is typically used for and some things to beware of.&lt;/p&gt;
&lt;p&gt;There are three kinds of web data mining to consider, all of which are different ways to collect website data: web usage mining, web content mining, and web structure mining. The first consists of analyzing user search and browsing patterns, the second is the analysis of the actual content on the web, and the last is the analysis of how websites are constructed. They are used either to create user profiles for marketing purposes, or to find patterns in web browsing and development so that the tools built for those purposes can be made more efficient.&lt;/p&gt;
&lt;p&gt;Probably the easiest way to collect website data in bulk is through the use of a web data extractor. These handy tools allow the user to sift through a website and pick out the information they want. They can generally extract meta tags, e-mail addresses, phone and fax numbers, URLs, and body text. The better extractors work automatically, and compile indices of the desired sites in a relatively short amount of time. While this data can be used for a number of things, it is probably best suited to internet marketing or information trading.&lt;/p&gt;
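&lt;p&gt;A minimal sketch of the kind of filtering such an extractor performs, assuming plain-text input and deliberately simple patterns. The names EMAIL_RE, PHONE_RE, and extract_contacts, along with the sample company and its numbers, are hypothetical; real-world phone and e-mail formats vary far more than these patterns cover.&lt;/p&gt;

```python
import re

# Simplified patterns chosen for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")

# Hypothetical page text, as it might look after the markup is stripped.
SAMPLE = "Acme Ltd. Phone: 555-010-0199 Fax: 555-010-0100 Email: info@example.com"


def extract_contacts(text):
    """Collect every e-mail address and phone-like number found in the text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


if __name__ == "__main__":
    print(extract_contacts(SAMPLE))
```

&lt;p&gt;Each result could then be written out as one row per company, for example with the csv module from the Python standard library, to produce a file that opens in a spreadsheet.&lt;/p&gt;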
&lt;p&gt;Web data collection poses an obvious threat to the privacy of anyone whose data ends up in a sample. A technical invasion of privacy occurs in data collection when an individual&amp;rsquo;s information is obtained, used, or given out, especially when the individual has not been told and has not consented. Therefore, one must be careful about how one gathers and distributes information gained in online data collection. The best way to collect website data is to do so with the express permission of whatever source is in possession of that information, to avoid any misunderstandings or breaches of privacy. Privacy policies can also be created so that users agree to have their information collected and used for certain purposes, for example to allow their information to be used in advertising but not in direct marketing.&lt;/p&gt;
&lt;p&gt;For more information please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt; .&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Mon, 15 Dec 2008 02:45:23 -0600</pubDate>
      <link>http://activerain.com/blogsview/836586/collect-website-data</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/800686/what-is-web-page-scraping-</guid>
      <title>What Is Web Page Scraping?</title>
      <description>&lt;p&gt;Web2DB Data Service (Web Page Scraping) is the most convenient way to scrape data from web pages in a short time. STOP wasting your time on manual COPY / PASTE work. We can deliver the data you want quickly, in just the format you want, since we run web data extraction jobs on our Blue Whale Web Data Extraction System every day. &lt;br /&gt; &lt;br /&gt;What do we do? - We scrape content from web pages on your target website and convert the raw data into structured records in a relational database. We guarantee the quality of the results through our standard service process guide. &lt;br /&gt; &lt;br /&gt;What do you get? &amp;ndash; Accurate, fast results, just as you want them. The file format can be Excel, Access, CSV, text, MS SQL, MySQL, etc. &lt;br /&gt; &lt;br /&gt;How does it work? - Our software is designed for web page scraping from both static and dynamic web pages. We use it to analyze the website, extract the data, process the data, and so on. You will receive a progress message every day while we work on your project, until you receive the preview of the final data. &lt;br /&gt; &lt;br /&gt;Benefits &lt;br /&gt; &lt;br /&gt;Low cost -- It will save you hundreds of thousands of man-hours and dollars! Even our competitors outsource their projects to us. &lt;br /&gt; &lt;br /&gt;Accurate Results -- Our system will help you get accurate results that people cannot collect by hand, so that you can generate sales leads, harvest product pricing data, duplicate an online database, and capture financial data, real estate data, job postings, auction info, and more, easily and happily. &lt;br /&gt; &lt;br /&gt;Fast -- We can finish a job that would take 20 person-days in only 2-4 hours, so that you can save time, labor, and money in your business and get an obvious time-to-market advantage over your competitors.
&lt;br /&gt; &lt;br /&gt;For more information, please visit: &lt;a href=&quot;http://www.knowlesys.com/&quot; target=&quot;_blank&quot;&gt;http://www.knowlesys.com/&lt;/a&gt; &lt;br /&gt;Data Collection Examples: &lt;a href=&quot;http://www.knowlesys.com/examples.htm&quot; target=&quot;_blank&quot;&gt;http://www.knowlesys.com/examples.htm&lt;/a&gt;&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Fri, 21 Nov 2008 03:06:45 -0600</pubDate>
      <link>http://activerain.com/blogsview/800686/what-is-web-page-scraping-</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/800683/knowlesys-data-mining-web-mining-and-knowledge-discovery</guid>
      <title>Knowlesys: Data Mining, Web Mining, and Knowledge Discovery</title>
      <description>&lt;div style=&quot;padding: 10px;&quot;&gt;
&lt;p&gt;Web Data Mining is the process of extracting data from target websites into a local database for further processing or use. (&lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_mining_applications.htm&quot; rel=&quot;nofollow&quot;&gt;Knowlesys Web Data Mining Applications&lt;/a&gt;)&lt;br /&gt; &lt;br /&gt;You need data before you can do Data Mining, Web Mining, Analytics, or Knowledge Discovery. &lt;br /&gt; &lt;br /&gt;Sometimes the data is on remote websites, so you need to do web data extraction first.&lt;br /&gt; &lt;br /&gt;Knowlesys provides a web data extraction service and custom-designed software at low cost.&lt;/p&gt;
&lt;p&gt;More information:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_entry_automation.htm&quot; rel=&quot;nofollow&quot;&gt;Knowlesys Data Entry Automation Service&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_export.htm&quot; rel=&quot;nofollow&quot;&gt;Knowlesys Custom-Designed Data Export Tool&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_extracting_tool.htm&quot; rel=&quot;nofollow&quot;&gt;Custom-Designed Data Extractor - Data Extraction Tool From Knowlesys&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_extraction_application.htm&quot; rel=&quot;nofollow&quot;&gt;Are You Looking for a Data Extraction Application?&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/div&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Fri, 21 Nov 2008 02:49:31 -0600</pubDate>
      <link>http://activerain.com/blogsview/800683/knowlesys-data-mining-web-mining-and-knowledge-discovery</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/798544/screen-scraping-data-from-web-pages</guid>
      <title>Screen Scraping Data  from Web Pages</title>
      <description>&lt;p&gt;Keywords: &lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;&lt;strong&gt;Screen Scraping Data&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;, Automatic Data Extraction, Screen Scraping, Web Data &lt;/strong&gt;&lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;&lt;strong&gt;Scraping&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;, Web Page &lt;/strong&gt;&lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;&lt;strong&gt;Scraping&lt;/strong&gt;&lt;/a&gt;&lt;br /&gt; &lt;br /&gt;Many web sites contain large sets of pages generated using a common template or layout. For example, Amazon lays out the author, title, comments, etc. in the same way in all its book pages. The values used to generate the pages (e.g., the author, title, ...) typically come from a database.&lt;/p&gt;
&lt;p&gt;In this paper, we study the problem of automatically extracting the database values from such web pages (&lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;&lt;strong&gt;Screen Scraping Data&lt;/strong&gt;&lt;/a&gt;) without any learning examples or other similar human input. We formally define the notion of a template, and propose a model that describes how values are encoded into pages using a template. We present an extraction algorithm that uses sets of words that have similar occurrence patterns in the input pages to construct the template. The constructed template is then used to extract values from the pages. We show experimentally that the extracted values make semantic sense in most cases.&lt;/p&gt;
&lt;p&gt;For more &lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;&lt;strong&gt;Screen Scraping Data&lt;/strong&gt;&lt;/a&gt; information, please visit: &lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;http://www.knowlesys.com/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;Screen Scraping&lt;/a&gt; Examples: &lt;a href=&quot;http://www.knowlesys.com/examples.htm&quot; rel=&quot;nofollow&quot;&gt;http://www.knowlesys.com/examples.htm&lt;/a&gt;&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Wed, 19 Nov 2008 21:49:22 -0600</pubDate>
      <link>http://activerain.com/blogsview/798544/screen-scraping-data-from-web-pages</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/798384/why-screen-scraping-service-</guid>
      <title>Why a Screen Scraping Service?</title>
      <description>&lt;h2&gt;&lt;span style=&quot;font-size: large;&quot;&gt;Without Screen Scraping tools&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Tools are needed to manage all available information, including the Web, subscription services, and internal data stores. Without a Screen Scraping tool (a product specifically designed to find, organize, and output the data you want), you have very poor choices for getting information. Your choices are:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Use search engines&lt;/strong&gt; Search engines help find some Web information, but they do not pinpoint information, cannot fill out the web forms they encounter to get you the information you need, are perpetually behind in indexing content, and at best go only two or three levels deep into a Web site. They also cannot search file directories on your network.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Manually surf the Web and file directories&lt;/strong&gt; Aside from being labor-intensive, this work is tedious, costly, error-prone, and very time-consuming. Humans have to read the content of each page to see whether it matches their criteria, whereas a computer simply matches patterns, which is much faster.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Create custom programming&lt;/strong&gt; Custom programming is costly, can be buggy, requires maintenance, and takes time to develop. Plus, the programs must be constantly updated because the location of information changes frequently.&lt;/p&gt;
&lt;p&gt;Inefficient methods mean the information analyst spends time finding, collecting, and aggregating data instead of analyzing it and gaining a competitive edge. They also affect the application programmer, who has to spend time developing extraction tools instead of tools for the core business.&lt;/p&gt;
&lt;p&gt;For more Screen Scraping information, please visit: &lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;http://www.knowlesys.com/&lt;/a&gt;&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Wed, 19 Nov 2008 20:12:17 -0600</pubDate>
      <link>http://activerain.com/blogsview/798384/why-screen-scraping-service-</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/798364/net-screen-scraping</guid>
      <title>Net Screen Scraping</title>
      <description>&lt;p&gt;Web2DB Data Service (&lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;Net Screen Scraping&lt;/a&gt;) is the most convenient way to scrape data from web pages in a short time. STOP wasting your time on manual COPY / PASTE work. We can deliver the data you want quickly, in just the format you want, since we run web data extraction jobs on our Blue Whale Web Data Extraction System every day.&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;color: #666666;&quot;&gt;&lt;strong&gt;What do we do?&lt;/strong&gt;&lt;/span&gt; - We &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;scrape&lt;/a&gt; content from web pages on your target website and convert the raw data into structured records in a relational database. We guarantee the quality of the results through our standard service process guide. &lt;br /&gt; &lt;br /&gt; &lt;span style=&quot;color: #666666;&quot;&gt;&lt;strong&gt;What do you get?&lt;/strong&gt;&lt;/span&gt; &amp;ndash; Accurate, fast results, just as you want them. The file format can be Excel, Access, CSV, text, MS SQL, MySQL, etc. &lt;br /&gt; &lt;br /&gt; &lt;span style=&quot;color: #666666;&quot;&gt;&lt;strong&gt;How does it work? &lt;/strong&gt;&lt;/span&gt;- Our software is designed for &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;Net Screen Scraping&lt;/a&gt; from both static and dynamic web pages. We use it to analyze the website, extract the data, process the data, and so on. You will receive a progress message every day while we work on your project, until you receive the preview of the final data.&lt;/p&gt;
&lt;table cellspacing=&quot;0&quot; border=&quot;0&quot; cellpadding=&quot;10&quot; width=&quot;100%&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;hd0&quot; colspan=&quot;2&quot;&gt;Benefits&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;span style=&quot;color: #666666;&quot;&gt;&lt;strong&gt;Low cost &lt;/strong&gt;&lt;/span&gt;-- It can save you hundreds of thousands of man-hours and dollars! Even our competitors outsource their projects to us.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span style=&quot;color: #666666;&quot;&gt;Accurate Results &lt;/span&gt;&lt;/strong&gt;-- Our system helps you get more accurate results than can be collected by hand. You can more easily generate sales leads, harvest product pricing data, duplicate an online database, and capture financial data, real estate data, job postings, auction info, and more.&lt;br /&gt; &lt;br /&gt; &lt;strong&gt;&lt;span style=&quot;color: #666666;&quot;&gt;Fast -- &lt;/span&gt;&lt;/strong&gt; We can finish in only 2-4 hours a job that would take 20 person-days by hand, saving you time, labor, and money and giving you a clear time-to-market advantage over your competitors.&lt;/p&gt;
</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Wed, 19 Nov 2008 19:59:54 -0600</pubDate>
      <link>http://activerain.com/blogsview/798364/net-screen-scraping</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/798349/how-to-mine-gold-in-that-mountain-of-web-data-</guid>
      <title>How to Mine Gold in That Mountain of Web Data?</title>
      <description>&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As the biggest information resource in the world, the Internet contains an almost unlimited quantity of information. The number of web pages has already exceeded 100 billion, and many of them are useful to you. However, the key information sits in a great many HTML pages in semi-structured form, and much valuable information lives in dynamic pages generated from databases, so it is very difficult, even impossible, to reuse the data in your own way.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;First, KnowleSys creates, maintains, and runs Web Robots that extract data from the Web on our BlueWhale platform.&lt;/p&gt;
&lt;p&gt;Then KnowleSys database experts process the data (transformation, cleansing, filtering, and integration) to produce your desired database for processing and analysis.&lt;/p&gt;
&lt;p&gt;Finally, KnowleSys delivers it to you in a format such as an Access database or an Excel spreadsheet.&lt;/p&gt;
&lt;p&gt;KnowleSys can also develop custom software for you, which you can run in-house at any time.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Benefit&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You can harvest the gold for your business in that Mountain of Web Data at low cost! Your desired database will reach your desktop in several days. &lt;br /&gt; &lt;br /&gt; You do not need to browse web pages one by one and copy and paste again and again. You do not need to worry about the data format. You do not need to spend your precious time learning anything. &lt;br /&gt; &lt;br /&gt; Using our service can save you hundreds of thousands of man-hours and dollars, and may yield a substantial 10-1000 times return on your cost!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Price &lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;From $0.02 down to $0.001 per record. The more records you order, the lower the price per record.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Time&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As short as 1-3 days, or up to 2-3 weeks, depending on the size of your project.&lt;br /&gt; You will receive a progress message about your project via email every day or every two days.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Case Study&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There is real data on the Internet, including addresses, phone numbers, email addresses, prices, company listings, contact listings, product listings, and job listings.&lt;/p&gt;
&lt;p&gt;Directory publishers and e-shop owners feed their databases from the Web by using the KnowleSys Web2DB service to get an Access database directly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Industries&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We provide services and custom software to clients across all industries. Some of the industries we have served are:&lt;br /&gt; &lt;br /&gt; Consulting, Marketing/Research &lt;br /&gt; Healthcare, Retail &lt;br /&gt; Defense, Manufacturing&lt;br /&gt; Software, Travel &lt;br /&gt; Energy, Real Estate&lt;br /&gt; Financial, Aerospace&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Summary&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;KnowleSys Web2DB Service is a low-cost way to extract critical business data from web sites, such as contact information (company name, phone numbers, e-mail addresses, postal address, and hyperlinks) and product information (product number, product name, price, stock, description, picture). The service is a cool tool for your business.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Background&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Providing services for unstructured-information                                  management is an estimated $6.46 billion market                                  this year and a $9.72 billion industry by 2006,                                  according to research from IDC.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt; Contact &amp;amp; Action&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For more information, please visit &lt;a href=&quot;http://www.knowlesys.com/&quot;&gt;http://www.knowlesys.com&lt;/a&gt;.&lt;/p&gt;</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Wed, 19 Nov 2008 19:50:47 -0600</pubDate>
      <link>http://activerain.com/blogsview/798349/how-to-mine-gold-in-that-mountain-of-web-data-</link>
    </item>
    <item>
      <guid>http://activerain.com/blogsview/798313/web-scraping-software</guid>
      <title>Web Scraping Software</title>
      <description>&lt;p&gt;Keywords: &lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_capture_software.htm&quot; rel=&quot;nofollow&quot;&gt;Web Scraping Software&lt;/a&gt;, &lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_capture_software.htm&quot; rel=&quot;nofollow&quot;&gt;Screen Scraping Software&lt;/a&gt;, &lt;a href=&quot;http://www.knowlesys.com/services.htm&quot; rel=&quot;nofollow&quot;&gt;Screen Scraping&lt;/a&gt;, &lt;a href=&quot;http://www.knowlesys.com/testimonials.htm&quot; rel=&quot;nofollow&quot;&gt;Html Scraping&lt;/a&gt;, Net Screen Scraping, &lt;a href=&quot;http://www.knowlesys.com/products.htm&quot; rel=&quot;nofollow&quot;&gt;Scraping Web Pages&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_capture_software.htm&quot; rel=&quot;nofollow&quot;&gt;Web Scraping Software&lt;/a&gt;&lt;br /&gt; Web2DB &lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_capture_software.htm&quot; rel=&quot;nofollow&quot;&gt;Web Scraping Software&lt;/a&gt;: capture, process, and filter text and images from web pages.&lt;br /&gt; &lt;br /&gt; What is Web2DB? &lt;br /&gt; Web2DB is a &lt;a href=&quot;http://www.knowlesys.com/articles/1t50/data_capture_software.htm&quot; rel=&quot;nofollow&quot;&gt;Web Scraping Software&lt;/a&gt; service. It takes unstructured data from HTML web pages and converts it into structured records.&lt;br /&gt; &lt;br /&gt; You tell us where you want to search, what you want to get, and how you want it formatted. We do all the work and send the results directly to you. The database format can be Excel, CSV, Access, MSSQL, or MySQL.&lt;/p&gt;
&lt;p&gt;For more information, please visit: &lt;a href=&quot;http://www.knowlesys.com/&quot; rel=&quot;nofollow&quot;&gt;http://www.knowlesys.com/&lt;/a&gt;&lt;br /&gt; Data Collection Examples: &lt;a href=&quot;http://www.knowlesys.com/examples.htm&quot; rel=&quot;nofollow&quot;&gt;http://www.knowlesys.com/examples.htm&lt;/a&gt;&lt;/p&gt;
</description>
      <dc:creator>Knowlesys  Software Inc.</dc:creator>
      <pubDate>Wed, 19 Nov 2008 19:19:37 -0600</pubDate>
      <link>http://activerain.com/blogsview/798313/web-scraping-software</link>
    </item>
  </channel>
</rss>
