
Reading a Local HTML File with Python and Extracting Its Table Contents

July 18, 2020 | 移动技术网 IT编程

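Take a personal grades web page as an example:
[image]
Right-click the page and view its source:

Right-click again and save the page as a standalone HTML file, then read and process it with the following code: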

import re

# Read the saved page; the with-block closes the file automatically
with open("GP.html", "r", encoding="utf-8") as f:
    html = f.read()

# Find the content between each <table and </table>
table = re.findall(r'<table(.*?)</table>', html, re.S)


def center_cells(table_html):
    # Strip tabs, newlines, spaces and &nbsp; entities (with or without the ';').
    # Removing spaces turns <td class="center"> into <tdclass="center">,
    # which is what the pattern below matches.
    compact = re.sub(r'[\t\n ]|&nbsp;?', '', table_html)
    # The grade-related values all sit in <td class="center"> cells
    return re.findall(r'<tdclass="center">(.*?)</td>', compact, re.S)


# The first two tables hold the grade information
td0 = center_cells(table[0])
print("Major course information:\n", td0)
td1 = center_cells(table[1])
print("Elective course information:\n", td1)
print("First value of the elective course information:\n", td1[0])

Result:
[image]
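The regex approach above depends on stripping every space so that the cell tag collapses into one token, which also removes spaces inside the cell text. As a possible alternative (not the method used in this article), the standard library's `html.parser` can collect the same `<td class="center">` cells without that step. The class name `CenterCellParser` and the `sample_html` string below are made up for illustration; the real page's markup may differ.

```python
from html.parser import HTMLParser


class CenterCellParser(HTMLParser):
    """Collect the text of every <td class="center"> cell, grouped by table."""

    def __init__(self):
        super().__init__()
        self.tables = []       # one list of cell strings per <table>
        self._in_cell = False  # True while inside a matching <td>
        self._buf = []         # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "td" and self.tables and dict(attrs).get("class") == "center":
            self._in_cell = True
            self._buf = []

    def handle_data(self, data):
        if self._in_cell:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            # &nbsp; arrives decoded as "\xa0"; strip() removes it at the edges
            self.tables[-1].append("".join(self._buf).strip())
            self._in_cell = False


# Hypothetical sample markup, standing in for the saved GP.html
sample_html = """
<table><tr><td class="center"> Calculus&nbsp;</td><td class="center">A</td></tr></table>
<table><tr><td class="center">Music</td><td>ignored</td></tr></table>
"""

parser = CenterCellParser()
parser.feed(sample_html)
parser.close()
print(parser.tables)
```

To use it on the saved page, feed it the `html` string read from `GP.html` instead of `sample_html`; `parser.tables[0]` and `parser.tables[1]` then play the role of `td0` and `td1`. This sketch does not handle nested tables.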
To compute a GPA, convert the extracted strings to their corresponding numeric values and do the calculation.
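A minimal sketch of that conversion is shown here. The article does not show which cells hold the grade and which hold the credits, so this assumes you have already paired them up as `(grade, credits)` strings. The letter-grade scale in `GRADE_POINTS`, the function `compute_gpa`, and the sample records are all hypothetical; substitute your own institution's scale and the values extracted from the page.

```python
# Hypothetical grade-to-point mapping; replace it with your institution's scale
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def compute_gpa(records):
    """Credit-weighted GPA from (grade, credits) pairs given as strings."""
    total_points = 0.0
    total_credits = 0.0
    for grade, credits in records:
        credit_value = float(credits)  # scraped values are strings
        total_points += GRADE_POINTS[grade] * credit_value
        total_credits += credit_value
    if total_credits == 0:
        raise ValueError("no credits to average over")
    return total_points / total_credits


# Made-up sample data, not taken from the page in this article
records = [("A", "3"), ("B", "2"), ("C", "1")]
gpa = compute_gpa(records)
print(round(gpa, 2))
```

If the page reports numeric scores rather than letter grades, the dictionary lookup would be replaced by whatever score-to-point rule applies to you.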

Original article: https://blog.csdn.net/qq_37813206/article/details/107380221
