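For:
[Solved] Autohome (汽车之家) car model/series data: speeding up the crawl of model spec parameters by dropping the JS step
While debugging it, the run failed with:
list index out of range mIndex=16 in getItemFirstValue
The cause is obvious:
the list index went past the number of items.
I debugged against the page that triggered the error
and found the cause.
The gasoline car used
earlier

has the fields shown above,
whereas the current page
does not:

So:
I need to work out the difference between the two:
when are these fields present,
and when are they absent?
Could it be
that a vehicle with
车身结构:客车 (body structure: bus/van)
lacks the fields that an ordinary car, for example
车身结构:5门5座SUV (body structure: 5-door 5-seat SUV)
does have? They all look related to acceleration and similar figures.
Perhaps such parameters do not matter for an ordinary bus/van, so they are left out?
Either way, the field positions I had assumed were fixed for gasoline cars: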
if carEnergyType == "汽油":
    # https://car.autohome.com.cn/config/spec/43593.html
    # https://car.autohome.com.cn/config/spec/41572.html
    # self.processGasolineCar(valueContent, carModelDict)
    # https://car.autohome.com.cn/config/spec/1006466.html
    gasolineCarKeyIdxMapDict = {
        "carModelEnvStandard" : 3,
        "carModelReleaseTime" : 4,
        "carModelMaxPower" : 5,
        "carModelMaxTorque" : 6,
        "carModelEngine" : 7,
        "carModelGearBox" : 8,
        "carModelSize" : 9,
        "carModelBodyStructure" : 10,
        "carModelMaxSpeed" : 11,
        "carModelOfficialSpeedupTime" : 12,
        "carModelActualTestSpeedupTime" : 13,
        "carModelActualTestBrakeDistance" : 14,
        "carModelMiitCompositeFuelConsumption" : 15,
        "carModelActualFuelConsumption" : 16,
    }
    wholeWarrantyIdx = 17
have to be adjusted dynamically somehow.
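Of course, the ideal solution would be to match on the field name directly.
But the field names here are obfuscated through JS and CSS,
so they cannot be matched directly.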
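To make the obfuscation concrete, here is a minimal sketch (not part of the crawler) that strips the empty placeholder spans from a few raw names taken from the config JSON dumps used in this post. The helper name `visible_text` and the regular expression are assumptions made for illustration only.

```python
import re

# Placeholder spans such as <span class='hs_kw8_configpl'></span> contain no text;
# the real characters are only filled in by CSS/JS when the page is rendered.
SPAN_PLACEHOLDER = re.compile(r"<span class='hs_kw\d+_config\w+'></span>")


def visible_text(raw_name):
    """Return what is left of a field name once the empty placeholder spans are removed."""
    return SPAN_PLACEHOLDER.sub("", raw_name)


raw_names = [
    "<span class='hs_kw8_configpl'></span><span class='hs_kw2_configpl'></span>(N·m)",
    "<span class='hs_kw14_configrI'></span><span class='hs_kw4_configrI'></span>(N·m)",
    "最高车速(km/h)",
]
for raw_name in raw_names:
    print(repr(visible_text(raw_name)))
```

The same field shows up with different class names on different pages, and what remains after stripping is only the unit, so a plain string comparison against the real field name cannot work.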
Then I noticed a detail:
the id values appear to be consistent across pages.
Pure electric:
debug/奥迪Q2L_etron_纯电智享型_39893_notRunJs_config.json
{
"id": 1186,
"name": "<span class='hs_kw8_configpl'></span><span class='hs_kw2_configpl'></span>(N·m)",
"pnid": "1_-1",
"valueitems": [{
"specid": 39893,
"value": "290"
}, {
"specid": 42875,
"value": "290"
}]
},
...
{
"id": 1246,
"name": "最高车速(km/h)",
"pnid": "1_-1",
"valueitems": [{
"specid": 39893,
"value": "150"
}, {
"specid": 42875,
"value": "150"
}]
}
...
}, {
"id": 1255,
"name": "整车<span class='hs_kw36_configpl'></span>",
"pnid": "1_-1",
"valueitems": [{
"specid": 39893,
"value": "三<span class='hs_kw7_configpl'></span>10<span class='hs_kw1_configpl'></span>公里"
}, {
"specid": 42875,
"value": "三<span class='hs_kw7_configpl'></span>10<span class='hs_kw1_configpl'></span>公里"
}]
}]
and:
Gasoline car:
debug/九龙A6_1006466.json
{
"id": 1186,
"name": "<span class='hs_kw14_configrI'></span><span class='hs_kw4_configrI'></span>(N·m)",
"pnid": "1_-1",
"valueitems": [{
"specid": 1006466,
"value": "290"
}, {
"specid": 1006465,
"value": "260"
}, {
"specid": 1006467,
"value": "330"
}]
},
...
}, {
"id": 1246,
"name": "最高车速(km/h)",
"pnid": "1_-1",
"valueitems": [{
"specid": 1006466,
"value": "-"
}, {
"specid": 1006465,
"value": "-"
}, {
"specid": 1006467,
"value": "-"
}]
}, {
"id": 1251,
"name": "工信部<span class='hs_kw41_configrI'></span><span class='hs_kw35_configrI'></span>(L/100km)",
"pnid": "1_-1",
"valueitems": [{
"specid": 1006466,
"value": "-"
}, {
"specid": 1006465,
"value": "-"
}, {
"specid": 1006467,
"value": "-"
}]
}, {
"id": 1255,
"name": "整车<span class='hs_kw48_configrI'></span>",
"pnid": "1_-1",
"valueitems": [{
"specid": 1006466,
"value": "-"
}, {
"specid": 1006465,
"value": "-"
}, {
"specid": 1006467,
"value": "-"
}]
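}]
->
- 1186: 最大扭矩(N·m) (maximum torque)
- 1251: 工信部综合油耗(L/100km) (MIIT combined fuel consumption)
and so on.
-> So the wanted fields can be matched directly by the id of each item.
The logic becomes simpler,
and there is no longer any need to check the energy type.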
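As a rough sketch of the id-based lookup (not the final crawler code), the snippet below maps each non-zero id to the first value in `valueitems` and then picks the wanted fields. The helper name `first_values_by_id`, the `WANTED` table, and the trimmed sample list are assumptions for illustration; the ids, keys, and values come from the dumps in this post.

```python
def first_values_by_id(param_items):
    """Map each non-zero item id to the value of the first spec in its valueitems."""
    values = {}
    for item in param_items:
        value_items = item.get("valueitems") or []
        if item["id"] != 0 and value_items:
            values[item["id"]] = value_items[0]["value"]
    return values


# Trimmed sample in the shape of the gasoline-car items (spec 1006466).
gasoline_items = [
    {"id": 1186,
     "name": "<span class='hs_kw14_configrI'></span><span class='hs_kw4_configrI'></span>(N·m)",
     "valueitems": [{"specid": 1006466, "value": "290"}]},
    {"id": 1246, "name": "最高车速(km/h)",
     "valueitems": [{"specid": 1006466, "value": "-"}]},
]

# id -> output key; 1250 is deliberately absent from the sample above.
WANTED = {
    1186: "carModelMaxTorque",
    1246: "carModelMaxSpeed",
    1250: "carModelOfficialSpeedupTime",
}

values = first_values_by_id(gasoline_items)
car_model = {key: values[item_id] for item_id, key in WANTED.items() if item_id in values}
print(car_model)
```

A field that a page does not provide is simply skipped, so nothing depends on list positions and the index-out-of-range failure cannot occur.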
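However, some fields seem to have an id of 0,
so this needs to be sorted out
by compiling the mapping between ids and fields:
"id": 1149, "name": "能源类型",
and so on.
Some items do have an id of 0:
"id": 0, "name": "上市<span class='hs_kw51_configvR'></span>",
The pattern behind this is left for later.
So far,
among the basic parameters, only
"id": 0, "name": "上市<span class='hs_kw51_configvR'></span>", # 上市时间 (release date)
has an id of 0; every other basic-parameter field has a non-zero id, which makes things easy.
Next, check a few more pages to see whether the same pattern holds.
Gasoline car
Then I noticed that the HTML I had been debugging already contains field definitions with complete ids:
debug/奥迪A3_configSpec_43593.html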
var keyLink = [{
"id": 1339,
"link": "https://car.autohome.com.cn/baike/detail_8_26_1339.html",
"name": "<span class='hs_kw27_baikefn'></span><span class='hs_kw85_baikefn'></span>/<span class='hs_kw9_baikefn'></span>"
}, {
"id": 1340,
"link": "https://car.autohome.com.cn/baike/detail_8_27_1340.html",
"name": "尾门玻璃<span class='hs_kw63_baikefn'></span>开启"
}, {
"id": 1341,
"link": "https://car.autohome.com.cn/baike/detail_8_30_1341.html",
"name": "<span class='hs_kw76_baikefn'></span>数量"
}, {
"id": 1342,
"link": "https://car.autohome.com.cn/baike/detail_8_31_1342.html",
"name": "<span class='hs_kw12_baikefn'></span>大灯雨雾模式"
}, {
"id": 1343,
"link": "https://car.autohome.com.cn/baike/detail_8_30_1343.html",
"name": "车载CD/DVD"
}, {
...
}, {
"id": 1234,
"link": "https://car.autohome.com.cn/baike/detail_7_21_1234.html",
"name": "<span class='hs_kw12_baikefn'></span>电动机<span class='hs_kw7_baikefn'></span><span class='hs_kw35_baikefn'></span>(kW)"
}, {
"id": 1242,
"link": "https://car.autohome.com.cn/baike/detail_8_31_1242.html",
"name": "车<span class='hs_kw12_baikefn'></span>雾灯"
}, {
"id": 1245,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1245.html",
"name": "变速箱"
}, {
"id": 1246,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1246.html",
"name": "最高车速(km/h)"
}, {
"id": 1250,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1250.html",
"name": "官方0-100km/h加速(s)"
}, {
"id": 1251,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1251.html",
"name": "工信部<span class='hs_kw22_baikefn'></span><span class='hs_kw17_baikefn'></span>(L/100km)"
}, {
"id": 1252,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1252.html",
"name": "<span class='hs_kw68_baikefn'></span>0-100km/h加速(s)"
}, {
"id": 1253,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1253.html",
"name": "<span class='hs_kw68_baikefn'></span>100-0km/h制动(m)"
}, {
"id": 1254,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1254.html",
"name": "<span class='hs_kw68_baikefn'></span><span class='hs_kw17_baikefn'></span>(L/100km)"
}, {
"id": 1255,
"link": "https://car.autohome.com.cn/baike/detail_7_18_1255.html",
"name": "整车<span class='hs_kw77_baikefn'></span>"
}, {
"id": 1256,
"link": "https://car.autohome.com.cn/baike/detail_7_19_1256.html",
"name": "<span class='hs_kw68_baikefn'></span><span class='hs_kw2_baikefn'></span>(mm)"
}, {
...
}, {
"id": 1290,
"link": "https://car.autohome.com.cn/baike/detail_7_21_1290.html",
"name": "百公里耗<span class='hs_kw56_baikefn'></span>(kWh/100km)"
}, {
"id": 1291,
"link": "https://car.autohome.com.cn/baike/detail_7_21_1291.html",
"name": "工信部纯电续航里程(km)"
}, {
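...
However, this turned out to be less useful for defining the specific fields than I had hoped;
the fields still have to be studied and defined in advance.
I searched for:
"id": 0,
There are not many items with id 0, only about 9.
The other several dozen all have a real id.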
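The same search can be done in code rather than by hand. This is only a sketch: `zero_id_names` is a hypothetical helper, and the sample list is a trimmed stand-in for the `paramtypeitems` list inside the page's config JSON.

```python
def zero_id_names(param_type_items):
    """Collect the raw names of all items whose id is 0, across every parameter group."""
    names = []
    for group in param_type_items:
        for item in group.get("paramitems", []):
            if item.get("id") == 0:
                names.append(item["name"])
    return names


# Trimmed stand-in for configDict["result"]["paramtypeitems"].
sample_groups = [
    {"paramitems": [
        {"id": 1149, "name": "能源类型"},
        {"id": 0, "name": "上市<span class='hs_kw61_configHa'></span>"},
        {"id": 1246, "name": "最高车速(km/h)"},
    ]},
]

names = zero_id_names(sample_groups)
print(len(names), names)
```

Running this over a full page dump lists exactly the fields that need a name-based or position-based rule.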
Among the fields needed above, the special one is
上市时间 (release date),
whose id is currently 0.
Its entry looks like this:
}, {
"id": 0,
"name": "上市<span class='hs_kw61_configHa'></span>",
"pnid": "1_-1",
"valueitems": [{
"specid": 43593,
"value": "2020.04"
}, {
"specid": 42418,
"value": "2019.10"
}, {
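...
-> It can be identified by checking that:
- the name starts with 上市 (or ends with 时间)
- I looked elsewhere and found no other field starting with 上市
- so there is no collision and the rule is usable
- the value is in YYYY.MM format
So this is good enough for now.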
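A minimal sketch of that rule, assuming an item dict in the shape of the config JSON. The function name `is_release_time` is hypothetical, and checking the value format in addition to the name prefix is an optional extra: it would reject a placeholder value such as "-".

```python
import re

# Values look like "2020.04" or "2019.10".
RELEASE_VALUE = re.compile(r"^\d{4}\.\d{2}$")


def is_release_time(item):
    """True if the item has id 0, a name starting with 上市, and a YYYY.MM first value."""
    if item["id"] != 0 or not item["name"].startswith("上市"):
        return False
    value_items = item.get("valueitems") or []
    return bool(value_items) and RELEASE_VALUE.match(value_items[0]["value"]) is not None


release_item = {
    "id": 0,
    "name": "上市<span class='hs_kw61_configHa'></span>",
    "pnid": "1_-1",
    "valueitems": [{"specid": 43593, "value": "2020.04"}, {"specid": 42418, "value": "2019.10"}],
}
speed_item = {
    "id": 1246,
    "name": "最高车速(km/h)",
    "pnid": "1_-1",
    "valueitems": [{"specid": 39893, "value": "150"}],
}
print(is_release_time(release_item), is_release_time(speed_item))
```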
Next, fill in the fields for the other vehicle types.
But after adding the electric-vehicle fields:
# electric vehicle parameters
{
"id": 1291,
"name": "工信部纯电续航里程(km)",
"key": "carModelMiitEnduranceMileagePureElectric",
}, {
"id": 1292,
# "name": "<span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "快充时间(小时)",
"key": "carModelQuickCharge",
}, {
"id": 0,
# "name": "<span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "慢充时间(小时)",
"key": "carModelSlowCharge",
}, {
"id": 0,
"name": "快充电量百分比",
"key": "carModelQuickChargePercent",
}, {
"id": 0,
"name": "电动机(Ps)",
"key": "carModelHorsePowerElectric",
}, {
"id": 0,
# "name": "<span class='hs_kw22_configpl'></span>续航里程(km)",
"name": "实测续航里程(km)",
"key": "carModelActualTestEnduranceMileage",
}, {
"id": 0,
# "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "实测快充时间(小时)",
"key": "carModelActualTestQuickCharge",
}, {
"id": 0,
# "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "实测慢充时间(小时)",
"key": "carModelActualTestSlowCharge",
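}
a problem showed up:
several fields have an id of 0,
and the name alone does not tell which one is which.
In particular:
"id": 0,
# "name": "<span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "慢充时间(小时)",
"id": 0,
# "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "实测快充时间(小时)",
"id": 0,
# "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "实测慢充时间(小时)",
All three are charging times whose obfuscated names end in (小时), so they cannot be told apart directly.
If they really cannot be told apart, the last two fields:
- 实测快充时间(小时) (measured fast-charge time)
- 实测慢充时间(小时) (measured slow-charge time)
can simply be skipped,
because, apart from:

the other pure-electric models
leave these fields empty as well:

However,
- 慢充时间(小时) (slow-charge time)
always has a value,
so it is still better to crawl it.
As a fallback, it can be identified by position:
the item right before
慢充时间
is always
快充时间 (fast-charge time)
->

and
快充时间
does have an id: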
}, {
"id": 1292,
# "name": "<span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "快充时间(小时)",
"key": "carModelQuickCharge",
}, {
"id": 0,
# "name": "<span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
"name": "慢充时间(小时)",
"key": "carModelSlowCharge",
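},
So the approach is: first find the index of 快充时间 (id 1292),
then look at the item at index + 1;
if that item also has
id=0
and a name ending in (小时),
it must be
慢充时间.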
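The positional rule for 慢充时间 can be sketched as follows. This is an illustration under the assumptions above, not the final code: `is_slow_charge` is a hypothetical name, and the sample list contains only the two relevant items.

```python
QUICK_CHARGE_ID = 1292  # 快充时间(小时), the only charging-time field with a real id


def is_slow_charge(index, items):
    """True if items[index] has id 0, a name ending in (小时), and follows 快充时间."""
    if index == 0:
        return False  # no previous item to compare against
    current, previous = items[index], items[index - 1]
    return (
        current["id"] == 0
        and current["name"].endswith("(小时)")
        and previous["id"] == QUICK_CHARGE_ID
    )


charge_items = [
    {"id": 1292,
     "name": "<span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)"},
    {"id": 0,
     "name": "<span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)"},
]
print([is_slow_charge(i, charge_items) for i in range(len(charge_items))])
```

Checking the previous item's id, rather than searching for 快充时间 first, gives the same result while visiting each item only once.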
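Then it occurred to me that,
for:
- 实测快充时间(小时)
- 实测慢充时间(小时)
the position can be used as well:
the two of them are always
the two items right after
- 实测续航里程(km) (measured range)

So they can also be identified by position.
That is:
find
- 实测续航里程(km)
and take the two items after it
(provided they do not run past the end of the list).
If both of those items have
id=0
and a name ending in (小时),
then they are, in order:
- 实测快充时间(小时)
- 实测慢充时间(小时)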
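A small sketch of this second positional rule, again as an illustration only: `measured_charge_keys` is a hypothetical helper that returns the list index of each recognised item, and the regular expression for the obfuscated 实测续航里程(km) name is an assumption based on the names seen in the electric-vehicle dump.

```python
import re

# The obfuscated name of 实测续航里程(km) ends with "</span>续航里程(km)".
ACTUAL_RANGE_NAME = re.compile(r"</span>续航里程\(km\)$")
MEASURED_KEYS = ("carModelActualTestQuickCharge", "carModelActualTestSlowCharge")


def measured_charge_keys(items):
    """Return {index: key} for the two id-0 '(小时)' items that follow 实测续航里程(km)."""
    result = {}
    for index, item in enumerate(items):
        if item["id"] != 0 or not ACTUAL_RANGE_NAME.search(item["name"]):
            continue
        following = items[index + 1:index + 3]
        # Both items must exist and both must look like charging times.
        if len(following) == 2 and all(
            nxt["id"] == 0 and nxt["name"].endswith("(小时)") for nxt in following
        ):
            for offset, key in enumerate(MEASURED_KEYS, start=1):
                result[index + offset] = key
    return result


electric_items = [
    {"id": 0, "name": "<span class='hs_kw22_configpl'></span>续航里程(km)"},
    {"id": 0, "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)"},
    {"id": 0, "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)"},
]
print(measured_charge_keys(electric_items))
```

Note that 工信部纯电续航里程(km) does not match the pattern, because its name has no span directly before 续航里程 and its id is 1291 rather than 0.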
With that settled, on to the code.
The current code is:
# module-level imports required by the methods below
import copy
import json
import re

@catch_status_code_error
def carConfigSpecCallback(self, response):
    print("in carConfigSpecCallback")
    curCarModelDict = response.save
    print("curCarModelDict=%s" % curCarModelDict)
    carModelDict = copy.deepcopy(curCarModelDict)
    configSpecHtml = response.text
    # print("configSpecHtml=%s" % configSpecHtml)
    # print("")
    # # for debug
    # return
    # # config json item index - spec table html item index = 2
    # ItemIndexDiff = 2
    # isUseSpecTableHtml = True
    # isUseConfigJson = False
    # valueContent = None
    # energyTypeIdx = 2
    # # Method 1: after run js, extract item value from spec table html
    # """
    # <table class="tbcs" id="tab_0" style="width: 932px;">
    # <tbody>
    # <tr>
    # <th class="cstitle" show="1" pid="tab_0" id="nav_meto_0" colspan="5">
    # <h3><span>基本参数</span></h3>
    # </th>
    # </tr>
    # <tr data-pnid="1_-1" id="tr_0">
    # """
    # tbodyDoc = response.doc("table[id='tab_0'] tbody")
    # print("tbodyDoc=%s" % tbodyDoc)
    # valueContent = tbodyDoc
    # isUseSpecTableHtml = True
    # isUseConfigJson = False
    # energyTypeIdx = 2

    # Method 2: not run js, extract item value from config json
    # get value from config json
    # var config = {"message" ...... "returncode":"0","taskid":"8be676a3-e023-4fa9-826d-09cd42a1810c","time":"2020-08-27 20:56:17"};
    foundConfigJson = re.search(r"var\s*config\s*=\s*(?P<configJson>\{[^;]+\});", configSpecHtml)
    print("foundConfigJson=%s" % foundConfigJson)
    if foundConfigJson:
        configJson = foundConfigJson.group("configJson")
        print("configJson=%s" % configJson)
        # configDict = json.loads(configJson, encoding="utf-8")
        configDict = json.loads(configJson)
        print("configDict=%s" % configDict)
        # if "result" in configDict:
        configResultDict = configDict["result"]
        print("configResultDict=%s" % configResultDict)
        # if "paramtypeitems" in configResultDict:
        paramTypeItemDictList = configResultDict["paramtypeitems"]
        print("paramTypeItemDictList=%s" % paramTypeItemDictList)
        # paramTypeItemNum = len(paramTypeItemDictList)
        # print("paramTypeItemNum=%s" % paramTypeItemNum)
        basicParamDict = paramTypeItemDictList[0]
        print("basicParamDict=%s" % basicParamDict)
        basicItemDictList = basicParamDict["paramitems"]
        print("basicItemDictList=%s" % basicItemDictList)
        # print("type(basicItemDictList)=%s" % type(basicItemDictList))
        # basicItemNum = len(basicItemDictList)
        # print("basicItemNum=%s" % basicItemNum)
        # valueContent = basicItemDictList
        # isUseSpecTableHtml = False
        # isUseConfigJson = True

        # process each basic parameter
        basicItemDictLen = len(basicItemDictList)
        print("basicItemDictLen=%s" % basicItemDictLen)
        for curIdx, eachItemDict in enumerate(basicItemDictList):
            print("[%d] eachItemDict=%s" % (curIdx, eachItemDict))
            curItemId = eachItemDict["id"]
            print("curItemId=%s" % curItemId)
            curItemName = eachItemDict["name"]
            print("curItemName=%s" % curItemName)
            curItemFirstValue = self.extractValueItemsValue(eachItemDict)
            print("curItemFirstValue=%s" % curItemFirstValue)

            curIdNameKeyMapDict = None
            if curItemId != 0:
                # non-zero id: look the field up directly by id
                curIdNameKeyMapDict = self.findMappingDict(curItemId)
            else:
                # id = 0
                foundSpan = re.search("<span", curItemName)
                print("foundSpan=%s" % foundSpan)
                isSpecialName = bool(foundSpan)
                print("isSpecialName=%s" % isSpecialName)
                if isSpecialName:
                    # id=0 and contain '<span' special name
                    foundSuffixHour = re.search(r"</span>\(小时\)$", curItemName)
                    print("foundSuffixHour=%s" % foundSuffixHour)
                    isSpecialSuffixHour = bool(foundSuffixHour)
                    print("isSpecialSuffixHour=%s" % isSpecialSuffixHour)
                    if isSpecialSuffixHour:
                        # charging-time fields: decide by position
                        prevIsQuickCharge = self.isPrevItemIsQuickCharge(curIdx, basicItemDictList)
                        print("prevIsQuickCharge=%s" % prevIsQuickCharge)
                        if prevIsQuickCharge:
                            # current is MUST 慢充时间(小时)
                            curIdNameKeyMapDict = {
                                "id": 0,
                                # "name": "<span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
                                "name": "慢充时间(小时)",
                                "namePattern": r"</span>\(小时\)$",
                                "key": "carModelSlowCharge",
                            }

                        if not curIdNameKeyMapDict:
                            prevIsActualTestEnduranceMileage = self.isPrevItemIsActualTestEnduranceMileage(curIdx, basicItemDictList)
                            print("prevIsActualTestEnduranceMileage=%s" % prevIsActualTestEnduranceMileage)
                            if prevIsActualTestEnduranceMileage:
                                # current is MUST 实测快充时间(小时)
                                curIdNameKeyMapDict = {
                                    "id": 0,
                                    # "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
                                    "name": "实测快充时间(小时)",
                                    "namePattern": r"</span>\(小时\)$",
                                    "key": "carModelActualTestQuickCharge",
                                }

                        if not curIdNameKeyMapDict:
                            prevPrevIsActualTestEnduranceMileage = self.isPrevPrevItemIsActualTestEnduranceMileage(curIdx, basicItemDictList)
                            print("prevPrevIsActualTestEnduranceMileage=%s" % prevPrevIsActualTestEnduranceMileage)
                            if prevPrevIsActualTestEnduranceMileage:
                                # current is MUST 实测慢充时间(小时)
                                curIdNameKeyMapDict = {
                                    "id": 0,
                                    # "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
                                    "name": "实测慢充时间(小时)",
                                    "namePattern": r"</span>\(小时\)$",
                                    "key": "carModelActualTestSlowCharge",
                                }
                    else:
                        curIdNameKeyMapDict = self.findMappingDict(0, curItemName)
                else:
                    curIdNameKeyMapDict = self.findMappingDict(0, curItemName)

            print("curIdNameKeyMapDict=%s" % curIdNameKeyMapDict)
            if curIdNameKeyMapDict:
                curItemKey = curIdNameKeyMapDict["key"]
                print("curItemKey=%s" % curItemKey)
                if curItemKey == "carModelWholeWarranty":
                    print("process special carModelWholeWarranty")
                    # 整车质保 (whole-vehicle warranty)
                    # 三<span class='hs_kw5_configJS'></span>10<span class='hs_kw0_configJS'></span>公里
                    print("curItemFirstValue=%s" % curItemFirstValue)
                    curItemFirstValue = self.extractWholeWarranty(curItemFirstValue)
                    print("curItemFirstValue=%s" % curItemFirstValue)
                carModelDict[curItemKey] = curItemFirstValue
                print("+++ added %s=%s" % (curItemKey, curItemFirstValue))

        print("after extract all item value: carModelDict=%s" % carModelDict)
        self.saveSingleResult(carModelDict)
    else:
        self.saveSingleResult(carModelDict)

    # if isUseConfigJson:
    #     energyTypeIdx += ItemIndexDiff
    # if valueContent:
    #     self.processDiffEneryTypeCar(carModelDict, valueContent, energyTypeIdx, isUseConfigJson, ItemIndexDiff)
    # else:
    #     self.saveSingleResult(carModelDict)

def isPrevItemIsQuickCharge(self, curIdx, itemDictList):
    print("in isPrevItemIsQuickCharge")
    print("curIdx=%s" % curIdx)
    prevIsQuickCharge = False
    if curIdx > 0:
        prevIdx = curIdx - 1
        print("prevIdx=%s" % prevIdx)
        prevItemDict = itemDictList[prevIdx]
        print("prevItemDict=%s" % prevItemDict)
        prevItemId = prevItemDict["id"]
        print("prevItemId=%s" % prevItemId)
        prevItemName = prevItemDict["name"]
        print("prevItemName=%s" % prevItemName)
        # "id": 1292,
        # # "name": "<span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
        # "name": "快充时间(小时)",
        QuickChargeItemId = 1292
        if prevItemId == QuickChargeItemId:
            prevIsQuickCharge = True
    print("prevIsQuickCharge=%s" % prevIsQuickCharge)
    return prevIsQuickCharge

def checkIsActualTestEnduranceMileage(self, prevSomeNum, curIdx, itemDictList):
    print("in checkIsActualTestEnduranceMileage")
    print("prevSomeNum=%s, curIdx=%s" % (prevSomeNum, curIdx))
    isActualTestEnduranceMileage = False
    # the item prevSomeNum positions back only exists when curIdx >= prevSomeNum
    minAllowIdx = prevSomeNum - 1
    if curIdx > minAllowIdx:
        prevSomeIdx = curIdx - prevSomeNum
        print("prevSomeIdx=%s" % prevSomeIdx)
        prevSomeItemDict = itemDictList[prevSomeIdx]
        print("prevSomeItemDict=%s" % prevSomeItemDict)
        prevSomeItemId = prevSomeItemDict["id"]
        print("prevSomeItemId=%s" % prevSomeItemId)
        prevSomeItemName = prevSomeItemDict["name"]
        print("prevSomeItemName=%s" % prevSomeItemName)
        if prevSomeItemId == 0:
            # "id": 0,
            # # "name": "<span class='hs_kw22_configpl'></span>续航里程(km)",
            # "name": "实测续航里程(km)",
            # "namePattern": "</span>续航里程\(km\)$",
            # "key": "carModelActualTestEnduranceMileage",
            foundActualTestEnduranceMileage = re.search(r"</span>续航里程\(km\)$", prevSomeItemName)
            print("foundActualTestEnduranceMileage=%s" % foundActualTestEnduranceMileage)
            if foundActualTestEnduranceMileage:
                isActualTestEnduranceMileage = True
    print("isActualTestEnduranceMileage=%s" % isActualTestEnduranceMileage)
    return isActualTestEnduranceMileage

def isPrevItemIsActualTestEnduranceMileage(self, curIdx, itemDictList):
    print("in isPrevItemIsActualTestEnduranceMileage")
    print("curIdx=%s" % curIdx)
    return self.checkIsActualTestEnduranceMileage(1, curIdx, itemDictList)

def isPrevPrevItemIsActualTestEnduranceMileage(self, curIdx, itemDictList):
    print("in isPrevPrevItemIsActualTestEnduranceMileage")
    print("curIdx=%s" % curIdx)
    return self.checkIsActualTestEnduranceMileage(2, curIdx, itemDictList)

def findMappingDict(self, itemId=0, itemName=""):
    foundMapDict = None
    paramIdNameKeyMapDict = [
        # gasoline car parameters
        # https://car.autohome.com.cn/config/spec/41572.html
        # https://car.autohome.com.cn/config/spec/1006465.html
        {
            "id": 1149,
            "name": "能源类型",
            "key": "carEnergyType",
        }, {
            "id": 1311,
            "name": "环保标准",
            "key": "carModelEnvStandard",
        }, {
            "id": 0,
            # "name": "上市<span class='hs_kw51_configvR'></span>", # 上市时间
            "name": "上市时间",
            "namePattern": "^上市",
            "key": "carModelReleaseTime",
        }, {
            "id": 1185,
            # "name": "<span class='hs_kw40_configvR'></span><span class='hs_kw15_configvR'></span>(kW)",
            "name": "最大功率(kW)",
            "key": "carModelMaxPower",
        }, {
            "id": 1186,
            # "name": "<span class='hs_kw40_configvR'></span><span class='hs_kw61_configvR'></span>(N·m)",
            "name": "最大扭矩(N·m)",
            "key": "carModelMaxTorque",
        }, {
            "id": 1150,
            "name": "发动机",
            "key": "carModelEngine",
        }, {
            "id": 1245,
            "name": "变速箱",
            "key": "carModelGearBox",
        }, {
            "id": 1148,
            "name": "长*宽*高(mm)",
            "key": "carModelSize",
        }, {
            "id": 1147,
            "name": "车身结构",
            "key": "carModelBodyStructure",
        }, {
            "id": 1246,
            "name": "最高车速(km/h)",
            "key": "carModelMaxSpeed",
        }, {
            "id": 1250,
            "name": "官方0-100km/h加速(s)",
            "key": "carModelOfficialSpeedupTime",
        }, {
            "id": 1252,
            # "name": "<span class='hs_kw26_configvR'></span>0-100km/h加速(s)",
            "name": "实测0-100km/h加速(s)",
            "key": "carModelActualTestSpeedupTime",
        }, {
            "id": 1253,
            # "name": "<span class='hs_kw26_configvR'></span>100-0km/h制动(m)",
            "name": "实测100-0km/h制动(m)",
            "key": "carModelActualTestBrakeDistance",
        }, {
            "id": 1251,
            # "name": "工信部<span class='hs_kw10_configvR'></span><span class='hs_kw43_configvR'></span>(L/100km)",
            "name": "工信部综合油耗(L/100km)",
            "key": "carModelMiitCompositeFuelConsumption",
        }, {
            "id": 1254,
            # "name": "<span class='hs_kw26_configvR'></span><span class='hs_kw43_configvR'></span>(L/100km)",
            "name": "实测油耗(L/100km)",
            "key": "carModelActualFuelConsumption",
        }, {
            "id": 1255,
            # "name": "整车<span class='hs_kw73_configvR'></span>",
            "name": "整车质保",
            "key": "carModelWholeWarranty",
        },
        # electric vehicle parameters
        # https://car.autohome.com.cn/config/spec/39893.html
        # https://car.autohome.com.cn/config/spec/42875.html
        {
            "id": 1291,
            "name": "工信部纯电续航里程(km)",
            "key": "carModelMiitEnduranceMileagePureElectric",
        }, {
            "id": 1292,
            # "name": "<span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
            "name": "快充时间(小时)",
            "key": "carModelQuickCharge",
        # }, {
        #     "id": 0,
        #     # "name": "<span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
        #     "name": "慢充时间(小时)",
        #     "namePattern": "</span>\(小时\)$",
        #     "key": "carModelSlowCharge",
        }, {
            "id": 0,
            # https://car.autohome.com.cn/config/spec/39893.html
            # {'id': 0, 'name': "<span class='hs_kw39_configMh'></span><span class='hs_kw11_configMh'></span>百分比", 'pnid': '1_-1', 'valueitems': [{'specid': 39893, 'value': '80'}, {'specid': 42875, 'value': '80'}]}
            "name": "快充电量百分比",
            "namePattern": "</span>百分比$",
            "key": "carModelQuickChargePercent",
        }, {
            "id": 0,
            "name": "电动机(Ps)",
            "key": "carModelHorsePowerElectric",
        }, {
            "id": 0,
            # "name": "<span class='hs_kw22_configpl'></span>续航里程(km)",
            "name": "实测续航里程(km)",
            "namePattern": r"</span>续航里程\(km\)$",
            "key": "carModelActualTestEnduranceMileage",
        # }, {
        #     "id": 0,
        #     # "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw39_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
        #     "name": "实测快充时间(小时)",
        #     "namePattern": "</span>\(小时\)$",
        #     "key": "carModelActualTestQuickCharge",
        # }, {
        #     "id": 0,
        #     # "name": "<span class='hs_kw22_configpl'></span><span class='hs_kw10_configpl'></span><span class='hs_kw40_configpl'></span>(小时)",
        #     "name": "实测慢充时间(小时)",
        #     "namePattern": "</span>\(小时\)$",
        #     "key": "carModelActualTestSlowCharge",
        }
    ]

    isItemZero = itemId == 0
    print("isItemZero=%s" % isItemZero)
    foundSpan = re.search("<span", itemName)
    print("foundSpan=%s" % foundSpan)
    isSpecialName = bool(foundSpan)
    print("isSpecialName=%s" % isSpecialName)
    isNotSpecialName = not isSpecialName
    print("isNotSpecialName=%s" % isNotSpecialName)

    # 1. non-zero id: match by id
    if not isItemZero:
        for eachMapDict in paramIdNameKeyMapDict:
            eachItemId = eachMapDict["id"]
            if eachItemId == itemId:
                foundMapDict = eachMapDict
                break

    # 2. plain (non-obfuscated) name: match by exact name
    if not foundMapDict:
        if itemName and isNotSpecialName:
            for eachMapDict in paramIdNameKeyMapDict:
                eachItemName = eachMapDict["name"]
                if eachItemName == itemName:
                    foundMapDict = eachMapDict
                    break

    # 3. id=0 and obfuscated name: match by namePattern
    if not foundMapDict:
        if (isItemZero and isSpecialName):
            for eachMapDict in paramIdNameKeyMapDict:
                if "namePattern" in eachMapDict:
                    eachItemNamePattern = eachMapDict["namePattern"]
                    print("eachItemNamePattern=%s" % eachItemNamePattern)
                    foundMatchName = re.search(eachItemNamePattern, itemName)
                    print("foundMatchName=%s" % foundMatchName)
                    if foundMatchName:
                        foundMapDict = eachMapDict
                        break

    print("foundMapDict=%s from id=%s, name=%s" % (foundMapDict, itemId, itemName))
    return foundMapDict

The data produced by the current run contains no errors:

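The data shows that
the energy type (能源类型), besides the values seen earlier:
- 汽油 (gasoline)
- 纯电动 (pure electric)
- 插电式混合动力 (plug-in hybrid)
- 油电混合 (gasoline-electric hybrid)
also takes these values:
- 柴油 (diesel)
- 汽油+48V轻混系统 (gasoline + 48V mild hybrid)
- 增程式 (range extender)
as well as the case covered in:
[Unresolved] Autohome (汽车之家) car model/series data: models whose energy type is blank
I also looked at the special ones:
- 汽油+48V轻混系统
- 增程式
They only appear on a few 东风风光 models, for example:
so they can be ignored.
There also seems to be one more problem:
[Unresolved] Autohome (汽车之家) car model/series data: carBrandId is empty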