A Hands-On Guide to Writing a Chinese Chatbot

Author / Presenter: 赵英俊 (Enjoy)
Five years in internet technology; currently focused on research and practice in the industrial intelligent internet, with research and hands-on work in NLP and AutoML, and continues to publish technical shares and research results.

Using TensorFlow 1.0.1, a Seq2Seq model, and Python 3, we will design and train a Chinese chatbot, chat with it online, and connect it to WeChat. This share contains no arcane algorithms or dry, hard-to-follow math; instead, it takes an application-oriented view and builds a Chinese chatbot on an existing machine learning platform, so that more people can experience the fun that artificial intelligence brings. The resulting bot is well suited to online customer service and online Q&A scenarios.

Along the way, I will explain the underlying principles and design ideas in plain terms, walk through the source code design in detail, and finally guide you through building a chatbot of your own.
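To give a concrete feel for the model before we start, below is a minimal sketch of the kind of attention-based Seq2Seq graph the tutorial builds, assuming the TensorFlow 1.0.x / Python 3 environment described above (the sizes are toy values, not the project's real configuration):

    import tensorflow as tf

    vocab_size, embed_size, hidden_size, seq_len = 1000, 64, 128, 10

    # The legacy seq2seq API expects one int32 tensor per time step.
    encoder_inputs = [tf.placeholder(tf.int32, [None], name='enc%d' % i) for i in range(seq_len)]
    decoder_inputs = [tf.placeholder(tf.int32, [None], name='dec%d' % i) for i in range(seq_len)]

    cell = tf.contrib.rnn.GRUCell(hidden_size)
    outputs, state = tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
        encoder_inputs, decoder_inputs, cell,
        num_encoder_symbols=vocab_size,
        num_decoder_symbols=vocab_size,
        embedding_size=embed_size,
        feed_previous=False)  # set True at inference time to feed back the model's own predictions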

498 readers have subscribed.
Subscription target reached: Dec 25, 2017
Article published: Jan 8
Live discussion: Jan 15, 20:30
Article comments / questions:
差不多先生
I'm afraid even hands-on teaching won't work on me; my hands just won't listen to me.
Mystery
Where is the source code, please? 🤓
赵英俊(Enjoy): Source code: https://github.com/zhaoyingjun/chatbot
whyseu
Same question: where is the source code?
赵英俊(Enjoy): Here it is: https://github.com/zhaoyingjun/chatbot
Mr.C
Some errors I ran into and how I fixed them, for reference.
Preprocessing: reading seq2seq.ini raised an encoding error; I changed the file's encoding with Notepad++ and it worked. Reading the conversation data also raised an encoding error, fixed by adding encoding='UTF-8' to the open() call.
Training: hit "can't pickle _thread.lock objects", fixed by adding the following lines in seq2seq.py:

    setattr(tf.contrib.rnn.GRUCell, '__deepcopy__', lambda self, _: self)
    setattr(tf.contrib.rnn.BasicLSTMCell, '__deepcopy__', lambda self, _: self)
    setattr(tf.contrib.rnn.MultiRNNCell, '__deepcopy__', lambda self, _: self)
拾薪: Hi, I ran into all of these problems as well, but I just can't get past the "can't pickle" one. Where exactly do these lines go, and what is the principle behind them? Could you please explain?
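A minimal sketch of where the workaround above can go, assuming it sits in seq2seq_model.py right after import tensorflow as tf and before any cell or model is created. The commonly cited cause: tf.contrib.legacy_seq2seq deep-copies the RNN cell while building the encoder/decoder, and in newer TF 1.x builds the cell object carries a _thread.lock that copy.deepcopy cannot pickle, so overriding __deepcopy__ to return the cell itself skips that copy.

    import tensorflow as tf

    # Patch the cell classes so copy.deepcopy() returns the cell unchanged instead of
    # trying (and failing) to copy the internal _thread.lock objects.
    for cell_cls in (tf.contrib.rnn.GRUCell,
                     tf.contrib.rnn.BasicLSTMCell,
                     tf.contrib.rnn.MultiRNNCell):
        setattr(cell_cls, '__deepcopy__', lambda self, _: self)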
xianjin
    /usr/local/bin/python3 /Users/xianjin/Desktop/app/chatbot/app.py
    /usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
      return f(*args, **kwds)
    2018-01-16 19:05:30.474447: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
    Traceback (most recent call last):
      File "/Users/xianjin/Desktop/app/chatbot/app.py", line 67, in <module>
        sess, model, enc_vocab, rev_dec_vocab = execute.init_session(sess, conf='seq2seq_serve.ini')
      File "/Users/xianjin/Desktop/app/chatbot/execute.py", line 214, in init_session
        model = create_model(sess, True)
      File "/Users/xianjin/Desktop/app/chatbot/execute.py", line 94, in create_model
        model = seq2seq_model.Seq2SeqModel( gConfig['enc_vocab_size'], gConfig['dec_vocab_size'], _buckets, gConfig['layer_size'], gConfig['num_layers'], gConfig['max_gradient_norm'], gConfig['batch_size'], gConfig['learning_rate'], gConfig['learning_rate_decay_factor'], forward_only=forward_only)
      File "/Users/xianjin/Desktop/app/chatbot/seq2seq_model.py", line 136, in __init__
        softmax_loss_function=softmax_loss_function)
      File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1224, in model_with_buckets
        softmax_loss_function=softmax_loss_function))
      File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1137, in sequence_loss
        softmax_loss_function=softmax_loss_function))
      File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1092, in sequence_loss_by_example
        crossent = softmax_loss_function(labels=target, logits=logit)
    TypeError: sampled_loss() got an unexpected keyword argument 'logits'
Mr.C
For the error sampled_loss() got an unexpected keyword argument 'logits': change the sampled_loss(inputs, labels) function to

    def sampled_loss(labels, logits):
        labels = tf.reshape(labels, [-1, 1])
        return tf.nn.sampled_softmax_loss(w_t, b, labels, logits, num_samples, self.target_vocab_size)

and it works.
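Why this works: as the traceback above shows, newer TF 1.x calls the loss as softmax_loss_function(labels=..., logits=...), so the parameter names must be exactly (labels, logits), and tf.nn.sampled_softmax_loss now takes its arguments in the order (weights, biases, labels, inputs, num_sampled, num_classes). Below is a self-contained sketch of the same idea; make_sampled_loss is a hypothetical wrapper used only to keep the sketch standalone, whereas in the repo the inner function is defined directly inside Seq2SeqModel.__init__, where w_t, b, num_samples and self.target_vocab_size are already in scope.

    import tensorflow as tf

    def make_sampled_loss(w_t, b, num_samples, target_vocab_size):
        # The parameter names must be (labels, logits) because sequence_loss_by_example
        # passes them as keyword arguments.
        def sampled_loss(labels, logits):
            labels = tf.reshape(labels, [-1, 1])
            # "logits" here are really the decoder's pre-projection outputs, which
            # sampled_softmax_loss consumes as its `inputs` argument.
            return tf.nn.sampled_softmax_loss(
                weights=w_t, biases=b, labels=labels, inputs=logits,
                num_sampled=num_samples, num_classes=target_vocab_size)
        return sampled_loss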
哒哒哒: Do you know what's going on with this one?

    Traceback (most recent call last):
      File "data_utls.py", line 12, in <module>
        gConfig=getConfig.get_config()
      File "C:\Users\jxiong\Desktop\chatbot-master\getConfig.py", line 26, in get_config
        parser.read(config_file)
      File "C:\Users\jxiong\AppData\Local\Programs\Python\Python36\lib\configparser.py", line 696, in read
        self._read(fp, filename)
      File "C:\Users\jxiong\AppData\Local\Programs\Python\Python36\lib\configparser.py", line 1012, in _read
        for lineno, line in enumerate(fp, start=1):
    UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 69: illegal multibyte sequence
北海.618
For that GBK problem, deleting all the Chinese text from seq2seq.ini fixed it for me.
庭震: 北海 bro, following your suggestion I solved the GBK problem above, but then another GBK error showed up. Did you run into this one?

    D:\ProgramData\Anaconda3\python.exe E:/PycharmProjects/chatbot-master_zhaoyingjun/chatbot-master/data_utls.py
    Traceback (most recent call last):
      File "E:/PycharmProjects/chatbot-master_zhaoyingjun/chatbot-master/data_utls.py", line 23, in <module>
        for line in f:
    UnicodeDecodeError: 'gbk' codec can't decode byte 0xa0 in position 21: illegal multibyte sequence
    Process finished with exit code 1
庭震: Solved it; sharing the fix: in data_utls.py, change with open(conv_path) as f: to with open(conv_path, 'r', encoding='utf-8') as f:
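Putting the two UTF-8 fixes from this thread together, a minimal sketch (Python 3; the corpus path below is just a placeholder for whatever conv_path your config resolves to):

    import configparser

    # getConfig.py: read seq2seq.ini explicitly as UTF-8 so Chinese text in the file
    # does not trip Windows' default GBK codec.
    parser = configparser.ConfigParser()
    parser.read('seq2seq.ini', encoding='utf-8')

    # data_utls.py: open the conversation corpus explicitly as UTF-8 as well.
    conv_path = 'train_data/conv'  # placeholder path for this sketch
    with open(conv_path, 'r', encoding='utf-8') as f:
        for line in f:
            pass  # ...the original preprocessing continues here...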
庭震
My TensorFlow is not version 1.0. Any suggestions for the error below?

    Traceback (most recent call last):
      File "E:/PycharmProjects/chatbot-master_zhaoyingjun/chatbot-master/execute.py", line 259, in <module>
        train()
      File "E:/PycharmProjects/chatbot-master_zhaoyingjun/chatbot-master/execute.py", line 124, in train
        model = create_model(sess, False)
      File "E:/PycharmProjects/chatbot-master_zhaoyingjun/chatbot-master/execute.py", line 94, in create_model
        model = seq2seq_model.Seq2SeqModel( gConfig['enc_vocab_size'], gConfig['dec_vocab_size'], _buckets, gConfig['layer_size'], gConfig['num_layers'], gConfig['max_gradient_norm'], gConfig['batch_size'], gConfig['learning_rate'], gConfig['learning_rate_decay_factor'], forward_only=forward_only)
      File "E:\PycharmProjects\chatbot-master_zhaoyingjun\chatbot-master\seq2seq_model.py", line 149, in __init__
        softmax_loss_function=softmax_loss_function)
      File "D:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\legacy_seq2seq\python\ops\seq2seq.py", line 1224, in model_with_buckets
        softmax_loss_function=softmax_loss_function))
      File "D:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\legacy_seq2seq\python\ops\seq2seq.py", line 1137, in sequence_loss
        softmax_loss_function=softmax_loss_function))
      File "D:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\tensorflow\contrib\legacy_seq2seq\python\ops\seq2seq.py", line 1092, in sequence_loss_by_example
        crossent = softmax_loss_function(labels=target, logits=logit)
    TypeError: sampled_loss() got an unexpected keyword argument 'logits'
陈小隆
Where's the source code?
赵英俊(Enjoy)
Source code: https://github.com/zhaoyingjun/chatbot
Why
How do I bookmark this?
Mr.C
I'm training on the sample dataset from the example. It has been running all night, 8 hours now, and it's at step 9900. When will it ever finish? My computer is so laggy it can barely move (┬_┬)
: Did your training ever finish?
🔜: When will it be done indeed; how many steps did you get to?
刘佳恒
Thank you very much for the article. One small point: the vocabulary is sorted in descending order; the blog may have written it the other way by mistake.
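For readers wondering what "descending order" refers to here, a hypothetical illustration of how such a vocabulary file is typically built: special tokens come first, followed by words sorted by descending frequency (the toy corpus and the _PAD/_GO/_EOS/_UNK token names are assumptions for this sketch, not taken from the article):

    from collections import Counter

    corpus = ['你 好', '你 好 吗', '我 很 好']  # made-up, pre-tokenized lines
    counter = Counter(w for line in corpus for w in line.split())

    # Special tokens first, then words from most to least frequent.
    vocab = ['_PAD', '_GO', '_EOS', '_UNK'] + [w for w, _ in counter.most_common()]
    print(vocab)  # e.g. ['_PAD', '_GO', '_EOS', '_UNK', '好', '你', ...]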
哒哒哒
Author, when I run the first data-processing step I get the error below. What is going on?

    Traceback (most recent call last):
      File "data_utls.py", line 12, in <module>
        gConfig=getConfig.get_config()
      File "C:\Users\jxiong\Desktop\chatbot-master\getConfig.py", line 26, in get_config
        parser.read(config_file)
      File "C:\Users\jxiong\AppData\Local\Programs\Python\Python36\lib\configparser.py", line 696, in read
        self._read(fp, filename)
      File "C:\Users\jxiong\AppData\Local\Programs\Python\Python36\lib\configparser.py", line 1012, in _read
        for lineno, line in enumerate(fp, start=1):
    UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 69: illegal multibyte sequence
兀自嚣: Has this been solved for you? I converted the file to UTF-8 but still get the same error; it should be an encoding problem, right?
赵黄油的fly
Why does every question I send come back as nothing but unk, without a single Chinese character or any other symbol? Where might the problem be?