
xeneta/LeadQualifier: Qualify sales leads with machine learning

Open-source project name:

xeneta/LeadQualifier

Open-source project URL:

https://github.com/xeneta/LeadQualifier

Programming language:

Python 100.0%

Project introduction:

LeadQualifier

This repo is a collection of scripts we use at Xeneta to qualify sales leads with machine learning. Read more about this project in the Medium article Boosting Sales With Machine Learning.

You can use this repo for two things:

  1. Try to beat our predictions using our data and your own algorithm
  2. Create a lead qualifier for your company, using your own data

Setup

Start off by running the following command:

pip install -r requirements.txt

You'll also need to download the stopwords corpus from the nltk package. Run the Python interpreter and type the following:

import nltk
nltk.download('stopwords')

1. Experiment with your own algorithms

We'd love to see more algorithms on the leaderboard, so send us a pull request once you've implemented one.

Xeneta Qualifier

We've provided you with our vectorized and transformed data. Unfortunately, we cannot share the raw text data, as it contains sensitive company information (who our customers are).

To test out your own algorithm, simply add it to the run.py file and run the script:

python run.py

Thanks to lampts for implementing the best-performing algorithm so far, the SGDClassifier.

Leaderboard:

Algorithm        Precision   Recall   F1 Score
SGD Classifier   0.872       0.940    0.905
Random Forest    0.845       0.915    0.878
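
The three leaderboard columns use the standard definitions of precision, recall, and F1. Here is a minimal sketch in plain Python 3 of how these scores could be computed from true labels and predictions, assuming they are reported for the qualified class. The function name precision_recall_f1 and the sample labels are hypothetical and are not taken from run.py:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = qualified, 0 = disqualified
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

As a check on the formula, the SGD Classifier row gives 2 * 0.872 * 0.940 / (0.872 + 0.940), which rounds to 0.905.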

PS: We're also experimenting with a neural net (in TensorFlow) in the nn.py file.

2. Create your own lead qualifier

To create your own lead qualifier, you'll need to get hold of company descriptions (to create your dataset). We currently use FullContact for this.

Note: We've added dummy data so that you can run both scripts without getting errors, and to give you examples of how the sheets should look.

Train Algorithm

This script trains an algorithm on your own input data. It expects two Excel sheets, named qualified and disqualified, in the input folder. Each sheet needs to contain two columns:

  • URL
  • Description
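
The two sheets can be thought of as one labeled dataset: each description from the qualified sheet gets label 1 and each from the disqualified sheet gets label 0. Here is a minimal sketch, assuming the rows have already been read into dictionaries keyed by column name. The helper build_dataset and the example rows are hypothetical, and the script in this repo may do this differently:

```python
def build_dataset(qualified_rows, disqualified_rows):
    """Combine rows from both sheets into parallel lists of texts and labels."""
    texts, labels = [], []
    for label, rows in ((1, qualified_rows), (0, disqualified_rows)):
        for row in rows:
            description = (row.get("Description") or "").strip()
            if not description:
                continue  # skip rows without a usable description
            texts.append(description)
            labels.append(label)
    return texts, labels


# Hypothetical rows shaped like the URL / Description columns
qualified = [
    {"URL": "example-shipper.com", "Description": "Global ocean freight buyer"},
]
disqualified = [
    {"URL": "example-bakery.com", "Description": "Local bakery and cafe"},
    {"URL": "example-empty.com", "Description": ""},
]
texts, labels = build_dataset(qualified, disqualified)
print(labels)
```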

Run the script:

python run.py

It'll dump three files into the qualify_leads project:

  • algorithm
  • vectorizer
  • tfidf_vectorizer
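
To illustrate how three such objects could fit together, here is a sketch using scikit-learn. It assumes the vectorizer is a CountVectorizer, the tfidf_vectorizer is a TfidfTransformer, and the algorithm is an SGDClassifier; these are guesses based on the file names and the leaderboard, not the repo's actual code. It uses scikit-learn's built-in English stopword list rather than nltk's, keeps the pickles in memory instead of writing files, and the training texts are made up:

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier

# Hypothetical descriptions and labels (1 = qualified, 0 = disqualified)
texts = [
    "ocean freight shipping containers",
    "container shipping logistics provider",
    "local bakery selling bread",
    "family bakery and cafe",
]
labels = [1, 1, 0, 0]

# Step 1: word counts; step 2: TF-IDF weighting; step 3: classifier
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(texts)
tfidf_vectorizer = TfidfTransformer()
features = tfidf_vectorizer.fit_transform(counts)
algorithm = SGDClassifier(random_state=0)
algorithm.fit(features, labels)

# Serialize the three fitted objects (in memory here, files in practice)
blobs = {
    "algorithm": pickle.dumps(algorithm),
    "vectorizer": pickle.dumps(vectorizer),
    "tfidf_vectorizer": pickle.dumps(tfidf_vectorizer),
}

# Reload them and classify a new description with the same three steps
loaded = {name: pickle.loads(blob) for name, blob in blobs.items()}
new_counts = loaded["vectorizer"].transform(["freight shipping company"])
new_features = loaded["tfidf_vectorizer"].transform(new_counts)
prediction = loaded["algorithm"].predict(new_features)
print(int(prediction[0]))
```

The point of saving all three objects is that new descriptions must pass through the same fitted vocabulary and weighting as the training data before the classifier sees them.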

You're now ready to start classifying your sales leads!

Qualify Leads

This is the script that actually predicts the quality of your leads. Add an Excel sheet named data to the input folder. Use the same format as the example file that's already there.

Run the script:

python run.py

It'll output an Excel sheet with a column named Prediction, where 1 equals qualified and 0 equals disqualified.
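
Once the output is available, the Prediction column can be used to split leads into two groups. Here is a small sketch in plain Python, assuming the output rows have been read into dictionaries. The rows and URLs are hypothetical:

```python
# Hypothetical output rows; Prediction is 1 (qualified) or 0 (disqualified)
rows = [
    {"URL": "example-a.com", "Prediction": 1},
    {"URL": "example-b.com", "Prediction": 0},
    {"URL": "example-c.com", "Prediction": 1},
]

qualified_urls = [row["URL"] for row in rows if row["Prediction"] == 1]
disqualified_urls = [row["URL"] for row in rows if row["Prediction"] == 0]
print(f"{len(qualified_urls)} qualified, {len(disqualified_urls)} disqualified")
```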

Got questions? Email me at [email protected].



