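While working on:
[Solved] Microsoft TTS returns 401 in Flask, apparently because a token error prevents the audio file from being generated
Although the Azure token problem is now solved, another problem showed up here:
When debugging a single Flask instance locally on a Mac, a global variable is shared between different functions within one file: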
resources/tts.py
<code>gMsToken = ""


def initAudioService():
    """
    init audio service related:
        init token
    :return:
    """
    # log = app.logger
    log.info("initAudioService")
    createAudioTempFolder()
    # getBaiduToken()
    getAzureSpeechToken()
    log.info("Init audio service complete")


def getAzureSpeechToken():
    """get Microsoft Azure speech service token key"""
    # log = app.logger
    global gMsToken
    log.info("getAzureSpeechToken: gMsToken=%s", gMsToken)
    gMsToken = refreshAzureSpeechToken()
    log.info("after getAzureSpeechToken: gMsToken=%s", gMsToken)


def msTTS(unicodeText,
          voiceName=settings.MS_TTS_VOICE_NAME,
          voiceRate=settings.MS_TTS_VOICE_RATE,
          voiceVolume=settings.MS_TTS_VOICE_VOLUME):
    """call ms azure tts to generate audio(mp3/wav/...) from text"""
    global gMsToken
    # log = app.logger
    log.info("msTTS: unicodeText=%s, gMsToken=%s", unicodeText, gMsToken)

    # lazily fetch a token if the module-level init did not produce one
    if not gMsToken:
        getAzureSpeechToken()

    isOk = False
    audioBinData = None
    errNo = 0
    errMsg = "Unknown error"

    msTtsUrl = settings.MS_TTS_URL
    log.info("msTtsUrl=%s", msTtsUrl)
    reqHeaders = {
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": settings.MS_TTS_OUTPUT_FORMAT,
        "Ocp-Apim-Subscription-Key": settings.MS_TTS_SECRET_KEY,
        # the HTTP auth scheme name is "Bearer"
        "Authorization": "Bearer " + gMsToken
    }
    log.info("reqHeaders=%s", reqHeaders)
    ...

################################################################################
# Global Init
################################################################################

# testAudioSynthesis()
initAudioService()
log.info("TTS init complete")
</code>
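The global variable gMsToken, which holds the Microsoft TTS token,
works fine in this setup.
However, in:
[Unsolved] Product demo deployed with gunicorn in the online environment fails to initialize and run properly
after deploying Flask to the production server with gunicorn + supervisor, there are multiple instances and processes:
gunicorn's worker count is set to CPU cores * 2 + 1, so a 4-core machine gets 4 × 2 + 1 = 9 Flask instances (one per worker process)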
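The worker-count rule above can be written down as a gunicorn config file. This is a minimal sketch, assuming a Python config file such as `gunicorn.conf.py`; `workers` and `worker_class` are standard gunicorn settings, while `recommended_workers` is a helper name made up here for illustration:

```python
import multiprocessing


def recommended_workers(cpu_cores):
    """Return the worker count from the rule above: cores * 2 + 1."""
    return cpu_cores * 2 + 1


# gunicorn reads module-level names such as `workers` from its config file.
workers = recommended_workers(multiprocessing.cpu_count())
# With the sync worker type, every worker is a separate OS process,
# so every worker imports tts.py and runs initAudioService() on its own.
worker_class = "sync"
```

On a 4-core machine this yields 9 workers, which matches the 9 Flask instances mentioned above.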
Then the log revealed a problem:
<code>[2018-08-28 18:29:07,783 INFO tts.py:129 getAzureSpeechToken] getAzureSpeechToken: gMsToken=
[2018-08-28 18:29:07,998 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=eyJhbGciOiJodHRwOi8vd3d3LnczLm9yZy8yMDAxLzA0L3htbGRzaWctbW9yZSNobWFjLXNoYTI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJ1cm46bXMuY29nbml0aXZlc2VydmljZXMiLCJleHAiOiIxNTM1NDUyNzQ3IiwicmVnaW9uIjoid2VzdHVzIiwic3Vic2NyaXB0aW9uLWlkIjoiNzU5ZWE5MjcyZjhiNDdhMzg5MWVkYzQ5ODhhMmFkNDEiLCJwcm9kdWN0LWlkIjoiU3BlZWNoU2VydmljZXMuUzAiLCJjb2duaXRpdmUtc2VydmljZXMtZW5kcG9pbnQiOiJodHRwczovL2FwaS5jb2duaXRpdmUubWljcm9zb2Z0LmNvbS9pbnRlcm5hbC92MS4wLyIsImF6dXJlLXJlc291cmNlLWlkIjoiL3N1YnNjcmlwdGlvbnMvZDA1NWNjMjMtZDY5OS00YjAyLTg0YzgtMDBlOTVmNzA4ZDFmL3Jlc291cmNlR3JvdXBzL1NwZWVjaC9wcm92aWRlcnMvTWljcm9zb2Z0LkNvZ25pdGl2ZVNlcnZpY2VzL2FjY291bnRzL0F6dXJlLVNwZWVjaCIsInNjb3BlIjoic3BlZWNoc2VydmljZXMiLCJhdWQiOiJ1cm46bXMuc3BlZWNoc2VydmljZXMud2VzdHVzIn0.R8Zfe0YyPhzpe-QITjGmbpJuSP1tpa5elFG0YYPVAyM
[2018-08-28 18:29:07,999 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:08,000 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:08,000 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:08,072 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d6cbe0>
[2018-08-28 18:29:08,197 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=eyJhbGciOiJodHRwOi8vd3d3LnczLm9yZy8yMDAxLzA0L3htbGRzaWctbW9yZSNobWFjLXNoYTI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJ1cm46bXMuY29nbml0aXZlc2VydmljZXMiLCJleHAiOiIxNTM1NDUyNzQ3IiwicmVnaW9uIjoid2VzdHVzIiwic3Vic2NyaXB0aW9uLWlkIjoiNzU5ZWE5MjcyZjhiNDdhMzg5MWVkYzQ5ODhhMmFkNDEiLCJwcm9kdWN0LWlkIjoiU3BlZWNoU2VydmljZXMuUzAiLCJjb2duaXRpdmUtc2VydmljZXMtZW5kcG9pbnQiOiJodHRwczovL2FwaS5jb2duaXRpdmUubWljcm9zb2Z0LmNvbS9pbnRlcm5hbC92MS4wLyIsImF6dXJlLXJlc291cmNlLWlkIjoiL3N1YnNjcmlwdGlvbnMvZDA1NWNjMjMtZDY5OS00YjAyLTg0YzgtMDBlOTVmNzA4ZDFmL3Jlc291cmNlR3JvdXBzL1NwZWVjaC9wcm92aWRlcnMvTWljcm9zb2Z0LkNvZ25pdGl2ZVNlcnZpY2VzL2FjY291bnRzL0F6dXJlLVNwZWVjaCIsInNjb3BlIjoic3BlZWNoc2VydmljZXMiLCJhdWQiOiJ1cm46bXMuc3BlZWNoc2VydmljZXMud2VzdHVzIn0.R8Zfe0YyPhzpe-QITjGmbpJuSP1tpa5elFG0YYPVAyM
[2018-08-28 18:29:08,198 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:08,198 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:08,198 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:08,265 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d69c50>
[2018-08-28 18:29:08,434 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=eyJhbGciOiJodHRwOi8vd3d3LnczLm9yZy8yMDAxLzA0L3htbGRzaWctbW9yZSNobWFjLXNoYTI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJ1cm46bXMuY29nbml0aXZlc2VydmljZXMiLCJleHAiOiIxNTM1NDUyNzQ4IiwicmVnaW9uIjoid2VzdHVzIiwic3Vic2NyaXB0aW9uLWlkIjoiNzU5ZWE5MjcyZjhiNDdhMzg5MWVkYzQ5ODhhMmFkNDEiLCJwcm9kdWN0LWlkIjoiU3BlZWNoU2VydmljZXMuUzAiLCJjb2duaXRpdmUtc2VydmljZXMtZW5kcG9pbnQiOiJodHRwczovL2FwaS5jb2duaXRpdmUubWljcm9zb2Z0LmNvbS9pbnRlcm5hbC92MS4wLyIsImF6dXJlLXJlc291cmNlLWlkIjoiL3N1YnNjcmlwdGlvbnMvZDA1NWNjMjMtZDY5OS00YjAyLTg0YzgtMDBlOTVmNzA4ZDFmL3Jlc291cmNlR3JvdXBzL1NwZWVjaC9wcm92aWRlcnMvTWljcm9zb2Z0LkNvZ25pdGl2ZVNlcnZpY2VzL2FjY291bnRzL0F6dXJlLVNwZWVjaCIsInNjb3BlIjoic3BlZWNoc2VydmljZXMiLCJhdWQiOiJ1cm46bXMuc3BlZWNoc2VydmljZXMud2VzdHVzIn0.xAr4O88waREMaTgzPOt4s7J8yLglU4nnSwmaz01qavc
[2018-08-28 18:29:08,435 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:08,436 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:08,436 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:08,474 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=eyJhbGcixxx4nnSwmaz01qavc
[2018-08-28 18:29:08,475 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:08,476 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:08,476 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:08,496 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=eyJhbGcixxx4s7J8yLglU4nnSwmaz01qavc
[2018-08-28 18:29:08,497 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:08,497 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:08,497 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:08,512 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d68cc0>
[2018-08-28 18:29:08,571 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d68da0>
[2018-08-28 18:29:08,603 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d69d30>
[2018-08-28 18:29:08,753 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=None
[2018-08-28 18:29:08,755 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:08,755 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:08,755 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:08,822 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=None
[2018-08-28 18:29:08,823 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:08,823 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:08,824 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:08,841 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d69e80>
[2018-08-28 18:29:08,893 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d69ef0>
[2018-08-28 18:29:08,998 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=eyJhbGcixxxnSwmaz01qavc
[2018-08-28 18:29:08,999 INFO tts.py:123 initAudioService] Init audio service complete
[2018-08-28 18:29:09,000 INFO tts.py:439 <module>] TTS init complete
[2018-08-28 18:29:09,000 INFO qa.py:18 <module>] resourcesPath=/xxx/robotDemo/resources
[2018-08-28 18:29:09,129 INFO qa.py:26 <module>] aiContext=<DialogueManager.Context object at 0x7f7eb0d69f60>
[2018-08-28 18:29:09,132 INFO tts.py:131 getAzureSpeechToken] after getAzureSpeechToken: gMsToken=None
[2018-08-28 18:29:09,134 INFO tts.py:123 initAudioService] Init audio service complete
</code>
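Problems visible here:
(1) Some gMsToken values are valid, while others are None
-> so in the later msTTS call the gMsToken in reqHeaders is None, and the Microsoft TTS cannot be used
(2) Besides, having multiple instances (multiple processes) each fetch their own Microsoft token is wasteful and poor practice in itself:
fetching a new token -> seems to invalidate the token that was fetched just before
-> So what I want is:
multiple threads sharing a single token globally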
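Both problems above (a None token reaching the request headers, and every caller fetching its own token) could be handled inside one process with a small cache. This is a minimal sketch, not the code from this project: `TokenCache`, `build_auth_header` and the `ttl_seconds` default are names and values assumed for illustration, and `fetch_token` stands in for something like `refreshAzureSpeechToken`. It only shares the token between threads of one process; separate gunicorn worker processes would each still hold their own cache.

```python
import threading
import time


class TokenCache:
    """Keep one token per process and refetch it only after it expires."""

    def __init__(self, fetch_token, ttl_seconds=540):
        self._fetch_token = fetch_token  # callable returning a token string or None
        self._ttl_seconds = ttl_seconds  # assumed lifetime; check the service docs
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # The lock ensures that only one thread refreshes at a time.
        with self._lock:
            if self._token is None or time.monotonic() >= self._expires_at:
                token = self._fetch_token()
                if not token:
                    # Fail loudly instead of building a header from None.
                    raise RuntimeError("token refresh returned no token")
                self._token = token
                self._expires_at = time.monotonic() + self._ttl_seconds
            return self._token


def build_auth_header(cache):
    """Build the Authorization header from the cached token."""
    return {"Authorization": "Bearer " + cache.get()}
```

A caller would create one `TokenCache(refreshAzureSpeechToken)` at module level and use `build_auth_header(cache)` when assembling the request headers, so a failed refresh raises an error at that point instead of producing a 401 later.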
-> While merging the fallback-dialogue code, I came across this in someone else's code:
nlp/search/qa/iqa.py
<code>class Singleton(type):
    """
    reference: https://stackoverflow.com/questions/31875/is-there-a-simple-elegant-way-to-define-singletons
    """
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(
                Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]


class SearchBasedQA(metaclass=Singleton):
    static_bt = None
    ...
</code>
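So a singleton can be used to share one variable or class instance globally across threads. Note that this holds only within a single process: each gunicorn worker process has its own memory, so a singleton by itself is not shared across processes.
So next I tried to solve:
the problem of global variables conflicting / being overwritten across multiple instances after deploying Flask,
and to see whether there is a better approach besides a singleton.
But first I looked at:
[Partially solved] Implementing a singleton across multiple threads or multiple processes in Python
The optimized code, in which all threads share one global singleton instance for fetching the Microsoft Azure TTS token, is: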
common/ThreadSafeSingleton.py
<code>import functools
import threading

thread_lock = threading.Lock()
print("ThreadSafeSingleton: thread_lock=%s" % thread_lock)


# refer: https://stackoverflow.com/questions/50566934/why-is-this-singleton-implementation-not-thread-safe
def synchronized(lock):
    """ Synchronization decorator """
    def wrapper(f):
        print("synchronized: wrapper: f=%s, lock=%s" % (f, lock))

        @functools.wraps(f)
        def inner_wrapper(*args, **kw):
            print("functools.wraps: args=%s, kw=%s" % (args, kw))
            with lock:
                return f(*args, **kw)
        print("inner_wrapper%s" % inner_wrapper)
        return inner_wrapper
    return wrapper


# class Singleton(type):
class ThreadSafeSingleton(type):
    _instances = {}

    @synchronized(thread_lock)
    def __call__(cls, *args, **kwargs):
        print("synchronized __call__: cls=%s, args=%s, kwargs=%s" % (cls, args, kwargs))
        print("cls._instances=%s" % cls._instances)
        if cls not in cls._instances:
            # cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
            cls._instances[cls] = super(ThreadSafeSingleton, cls).__call__(*args, **kwargs)
            print("after added _instances: cls._instances=%s" % cls._instances)
        return cls._instances[cls]
</code>
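To show how such a metaclass might be used for the token, here is a minimal sketch. It redefines a quiet variant of the metaclass (debug prints removed, lock taken inline) so that the block runs on its own; `AzureTokenHolder`, `create_from_many_threads` and the dummy token value are hypothetical names for illustration, and the real holder would call `refreshAzureSpeechToken()` instead.

```python
import threading


class ThreadSafeSingleton(type):
    """Quiet variant of the metaclass above: one instance per class, per process."""

    _instances = {}
    _lock = threading.Lock()

    def __call__(cls, *args, **kwargs):
        # Check and create under the lock so that two threads
        # cannot both construct the instance.
        with ThreadSafeSingleton._lock:
            if cls not in cls._instances:
                cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class AzureTokenHolder(metaclass=ThreadSafeSingleton):
    """Hypothetical holder; __init__ is where the token would be fetched."""

    init_count = 0

    def __init__(self):
        AzureTokenHolder.init_count += 1
        self.token = "dummy-token"  # placeholder, not a real token


def create_from_many_threads(thread_count=20):
    """Construct the holder from several threads and return what each one got."""
    holders = []

    def worker():
        holders.append(AzureTokenHolder())

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return holders
```

Every thread receives the same `AzureTokenHolder` object and `__init__` runs once, so the token fetch would happen once per process. Each gunicorn worker process would still build its own instance.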
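Meanwhile, to get a global singleton logger that works properly across threads, I also did:
[Solved] Turning the Flask app logger into a singleton to avoid circular imports and repeated initialization of the Flask instance
And then:
[Partially solved] Implementing a singleton across multiple threads or multiple processes in Python
That one is only partially solved, but for the business logic here it is usable for now:
Previously:
gunicorn: multiple workers (9), type sync, which means multiple processes
so multiple Microsoft tokens were initialized, and 3 of them failed to initialize
Now changed to:
gunicorn: single worker, type gevent, which is indeed a single process
For some unknown reason the Flask app log still shows 3 processes initializing the token, but every request returns 200 and every token is valid
-> Subsequent calls to the Microsoft TTS for text-to-speech work normally for now.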
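For reference, the changed worker setup described above might look like this in a gunicorn config file. This is a sketch only, assuming a Python config file such as `gunicorn.conf.py`; the actual config file used here is not shown in this post.

```python
# Sketch of a gunicorn config for the single-worker setup described above.
workers = 1              # one worker process instead of cores * 2 + 1
worker_class = "gevent"  # cooperative worker type; needs the gevent package installed
```

With a single worker process, a module-level global or a per-process singleton is enough to hold one shared token, at the cost of no longer spreading requests over several processes.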