当前位置: 移动技术网 > IT编程>脚本编程>Python > 荐 【项目实战】数据爬虫 + 数据清洗 + 数据可视化+开源代码啦

荐 【项目实战】数据爬虫 + 数据清洗 + 数据可视化+开源代码啦

2020年07月20日  | 移动技术网IT编程  | 我要评论


话不多说,自己上篇爬虫博客写的还行,10000的阅读量以及360多的收藏和100多的点赞评论,自己一个一个发送代码发了一个多月,现在正式开源!
附上原文传送门https://blog.csdn.net/qq_46098574/article/details/106048756

写在前面:

自己已经创建公众号啦~

AI算法交流+开源数据汇总+私房数据及标注数据共享+自己实践项目开源

欢迎大家关注:DeepAI 视界

在这里插入图片描述

展示一下:
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

爬虫:链家网二手房(以贵阳市为例)

"""
爬虫
"""
import requests
from lxml import etree
import xlwt
import xlrd
import csv
import pandas as pd
import time
import re
class LanjiaSpider:
    """Scrape second-hand housing listings from lianjia.com and append them to a.csv.

    Typical use: ``LanjiaSpider().run()``.  Expects a module-level ``csv`` and
    ``time`` import; network access is required by ``run``.
    """

    def __init__(self):
        # ``{}`` is filled in with the page number by run().
        self.url = 'https://wh.lianjia.com/ershoufang/ronganxian/pg{}/'
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}

    def get_response_spider(self, url_str):  # send the request
        """Fetch one listing page and return it parsed as an lxml element tree.

        BUG FIX: the original ignored ``url_str`` and always requested
        ``self.url`` (still containing the literal ``{}``), so every page
        fetched was identical.
        """
        get_response = requests.get(url_str, headers=self.headers)
        time.sleep(2)  # throttle between requests to be polite to the server
        return etree.HTML(get_response.content.decode())

    def get_content_html(self, html):  # extract fields with XPath
        """Extract the seven listing fields from the page and store them on self."""
        self.houseInfo = html.xpath('//div[@class="houseInfo"]/text()')
        self.title = html.xpath('//div[@class="title"]/a/text()')
        self.positionInfo = html.xpath('//div[@class="positionInfo"]/a/text()')
        self.totalPrice = html.xpath('//div[@class="totalPrice"]/span/text()')
        self.unitPrice = html.xpath('//div[@class="unitPrice"]/span/text()')
        self.followInfo = html.xpath('//div[@class="followInfo"]/text()')
        self.tag = html.xpath('//div[@class="tag"]/span/text()')

    def xpath_houseInfo(self):
        """Yield the extracted houseInfo strings one by one."""
        yield from self.houseInfo

    def qingxi_data_houseInfo(self):  # clean data and persist it
        """Append one CSV row per listing of the current page to a.csv.

        BUG FIXES vs. the original:
        * ``fieldnames`` was an undefined name unless the script happened to be
          run as ``__main__``; it is now defined locally;
        * the hand-rolled ``next()`` loop raised an unhandled ``StopIteration``
          whenever any field list was shorter than ``houseInfo``; ``zip`` stops
          safely at the shortest list;
        * the output file is opened once instead of once per row.
        """
        fieldnames = ['houseInfo', 'title', 'positionInfo', 'totalPrice/万元', 'unitPrice', 'followInfo', 'tag']
        rows = zip(self.houseInfo, self.title, self.positionInfo,
                   self.totalPrice, self.unitPrice, self.followInfo, self.tag)
        with open("a.csv", "a", newline="", encoding="utf-8-sig") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            for i, row in enumerate(rows, start=1):
                writer.writerow(dict(zip(fieldnames, row)))
                print("写入第" + str(i) + "行数据")

    def xpath_title(self):
        """Yield the extracted title strings one by one."""
        yield from self.title

    def xpath_positionInfo(self):
        """Yield the extracted positionInfo strings one by one."""
        yield from self.positionInfo

    def xpath_totalPrice(self):
        """Yield the extracted totalPrice strings one by one."""
        yield from self.totalPrice

    def xpath_unitPrice(self):
        """Yield the extracted unitPrice strings one by one."""
        yield from self.unitPrice

    def xpath_followInfo(self):
        """Yield the extracted followInfo strings one by one."""
        yield from self.followInfo

    def xpath_tag(self):
        """Yield the extracted tag strings one by one."""
        yield from self.tag

    def run(self):
        """Crawl listing pages 1..56 (page 57 excluded) and persist each one.

        BUG FIX: the original incremented ``i`` and then tested ``i == 1``,
        which can never be true after the increment, so the loop never
        terminated.  The original comment says page 57 is excluded.
        """
        for page in range(1, 57):
            url_str = self.url.format(page)  # build the page URL
            html = self.get_response_spider(url_str)
            self.get_content_html(html)
            self.qingxi_data_houseInfo()


if __name__ == "__main__":
    # Write the CSV header once, before the spider starts appending data rows.
    # ``fieldnames`` stays module-level because the spider class reads it too.
    fieldnames = ['houseInfo', 'title', 'positionInfo', 'totalPrice/万元', 'unitPrice', 'followInfo', 'tag']
    with open("a.csv", "a", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
    LanjiaSpider().run()

对应的数据可视化:

"""
数据分析及可视化
"""
import pandas as pd
from pyecharts.charts import Line, Bar
import numpy as np
from pyecharts.globals import ThemeType
from pyecharts.charts import Pie
from pyecharts import options as opts


places = ['lianjia_BaiYunQu', 'lianjia_GuanShanHuQu', 'lianjia_HuaXiQu', 'lianjia_NanMingQu', 'lianjia_WuDangQu', 'lianjia_YunYanQu']
place = ['白云区', '观山湖区', '花溪区', '南明区', '乌当区', '云岩区']
avgs = []  # 房价均值
median = []  # 房价中位数
favourate_avg = []  # 房价收藏人数均值
favourate_median = []  # 房价收藏人数中位数
houseidfo = ['2室1厅', '3室1厅', '2室2厅', '3室2厅', '其他']  # 房型定义
houseidfos = ['2.1', '3.1', '2.2', '3.2']
sum_house = [0,  0, 0, 0, 0]  # 各房型数量
price = []  # 房价
fav = []  # 收藏人数
type = []
area = []  # 房间面积
def avg(name):
    df = pd.read_csv(str(name)+'.csv', encoding='utf-8')
    pattern = '\d+'
    df['totalPrice/万元'] = df['totalPrice/万元'].str.findall(pattern)
    df['followInfo'] = df['followInfo'].str.findall(pattern)
    df['houseInfo'] = df['houseInfo'].str.findall(pattern)
    sum_houses = [0, 0, 0, 0, 0]
    # print(sum_house)
    avg_work_year = []
    medians = []
    favourates = []
    k = 0
    k1 = 0
    k3 = 0
    k4 = 0

    for i in range(len(df)):
        if (i + 1) % 2 == 0:
            continue
        else:
            if len(df['totalPrice/万元'][i]) == 2:
                avg_work_year.append(','.join(df['totalPrice/万元'][i]).replace(',', '.'))
                medians.append(float(','.join(df['totalPrice/万元'][i]).replace(',', '.')))
                price.append(','.join(df['totalPrice/万元'][i]).replace(',', '.'))
            if len(df['followInfo'][i]) ==2:
                favourates.append(int(','.join(df['followInfo'][i][:1])))
                fav.append(int(','.join(df['followInfo'][i][:1])))
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 2.1:
                k +=1
                sum_houses[0] =k
                type.append(2.1)
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 3.1:
                k1 +=1
                sum_houses[1] =k1
                type.append(3.1)
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 2.2:
                k3 +=1
                sum_houses[2] =k3
                type.append(2.2)
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 3.2:
                k4 +=1
                sum_houses[3] =k4
                type.append(3.2)
            else:
                k4 +=1
                sum_houses[4] = k4
                type.append('other')
            area.append(float(','.join(df['houseInfo'][i][2:4]).replace(',', '.')))
    sum_house[0] =sum_houses[0]
    sum_house[1] = sum_houses[1]
    sum_house[2] = sum_houses[2]
    sum_house[3] = sum_houses[3]
    sum_house[4] = sum_houses[4]

    favourates.sort()
    favourate_median.append(int(np.median(favourates)))
    medians.sort()
    median.append(np.median(medians))
    # price = avg_work_year
    b = len(avg_work_year)
    b1= len(favourates)

    sum = 0
    sum1 = 0
    for i in avg_work_year:
        sum = sum+float(i)
    avgs.append(round(sum/b, 2))
    for i in favourates:
        sum1 = sum1+float(i)
    favourate_avg.append(round(int(sum1/b1), 2))

# Aggregate every district CSV into the module-level lists.
for district in places:
    avg(district)

print("各区平均房价", avgs)
print('各房型的出售总数:', sum_house)
print("房间面积", area)
"""
[280, 56, 504, 1676, 1680]
[392, 112, 448, 1679, 1680]
[224, 0, 616, 3359, 3360]
[448, 112, 280, 1679, 1680]
[504, 0, 336, 1680, 1679]
[224, 56, 168, 1680, 1670]


[66.17, 65.6, 76.04, 78.94, 62.06, 74.37] 
[68.8, 67.8, 79.8, 70.8, 57.6, 78.8]
[6, 6, 9, 4, 4, 4] [5, 4, 3, 2, 3, 2]
"""
# print(median)
# print(favourate_avg,favourate_median)
# Line chart of mean vs. median price per district (render left disabled).
chart = Line()
chart.add_xaxis(place)
chart.add_yaxis("贵阳各地房价平均值(万元)", avgs)
chart.add_yaxis("贵阳各地房价中位数值(万元)", median)
# chart.render("predict_line.html")

def bar() -> Bar:
    """Bar chart comparing mean and median prices per district."""
    chart = Bar({"theme": ThemeType.MACARONS})
    chart.add_xaxis(place)
    chart.add_yaxis("平均值", avgs)
    chart.add_yaxis("中位数", median)
    chart.set_global_opts(title_opts={"text": "贵阳各地房价(万元)"})
    return chart
bar().render("predict_bar.html")
# print(sum_house)
def bar() -> Bar:
    """Grouped bar chart of layout counts per district.

    NOTE(review): the first five series are hard-coded snapshots; only the
    last one uses the live ``sum_house`` totals — presumably recorded runs.
    """
    chart = Bar({"theme": ThemeType.MACARONS})
    chart.add_xaxis(houseidfo)
    chart.add_yaxis(place[0], [280, 56, 504, 1676, 1680])
    chart.add_yaxis(place[1], [392, 112, 448, 1679, 1680])
    chart.add_yaxis(place[2], [224, 0, 616, 3359, 3360])
    chart.add_yaxis(place[3], [448, 112, 280, 1679, 1680])
    chart.add_yaxis(place[4], [504, 0, 336, 1680, 1679])
    chart.add_yaxis(place[-1], sum_house)
    # chart.add_yaxis("中位数", favourate_median)
    chart.set_global_opts(title_opts={"text": "贵阳各地房型\n数量"})
    return chart
# bar().render("house_bar.html")
line = Line()
line.add_xaxis(place)
line.add_yaxis("贵阳各地房子平均面积\n(平米)", area)
line.render("Area_line.html")



# Pie: average follower count per district.
list_num = favourate_avg
attr = place
s = [list(pair) for pair in zip(attr, list_num)]
c = (Pie()
     .add("", s)
     .set_global_opts(title_opts=opts.TitleOpts(title="贵阳市各区楼房\n平均收藏人数"))
     .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}")))
c.render("pie_avg.html")

# Pie: median follower count per district.
list_num = favourate_median
attr = place
s = [list(pair) for pair in zip(attr, list_num)]
c = (Pie()
     .add("", s)
     .set_global_opts(title_opts=opts.TitleOpts(title="贵阳市各区楼房\n收藏人数中位数"))
     .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}")))
c.render("pie_median.html")

from pyecharts import options as  opts
from pyecharts.charts import Scatter3D
from pyecharts.faker import Faker


# Convert the collected price strings to floats.
price = [float(p) for p in price]
# print(price)
data = []
# Build (layout, avg-favourites, price) triples for the 3D scatter.
# BUG FIX / perf: the original looped 100*100*500 times and relied on a bare
# ``except`` to skip out-of-range indices, raising millions of pointless
# IndexErrors.  Bounded ranges produce exactly the same rows.
for j in range(min(100, len(type))):
    for k in range(min(100, len(favourate_avg))):
        for i in range(min(500, len(price))):
            data.append([type[j], favourate_avg[k], price[i]])
# print(data)
scatter = (
     Scatter3D(init_opts=opts.InitOpts(width='900px', height='600px'))  # init
          .add("", data,
               grid3d_opts=opts.Grid3DOpts(
                    width=300, depth=300, rotate_speed=300, is_rotate=True,
               ),)

          # global options
          .set_global_opts(
          title_opts=opts.TitleOpts(title="房型——关注度——价格\n三维关系图"),  # title
          visualmap_opts=opts.VisualMapOpts(
               max_=100,  # colour-map maximum
               pos_top=200,  # distance of the visualMap widget from the top
               range_color=Faker.visual_color  # colour gradient
          )
     )
          # .render("3D散点图.html")
)
print('数据分析和可视化结束,左边点开~')

同时赠送给大家另一个版本的:

爬虫:链家网:柳州市

# -*- coding: utf-8 -*-
import scrapy


import requests
from lxml import etree
import xlwt
import xlrd
import csv
import pandas as pd
import time

class LanjiaSpider:
    """Scrape second-hand housing listings (Liuzhou/鱼峰区) into yufengqu.csv.

    Typical use: ``LanjiaSpider().run()``.  Network access is required.
    """

    def __init__(self):
        # ``{}`` is filled in with the page number by run().
        self.url = 'https://liuzhou.lianjia.com/ershoufang/yufengqu/pg{}/'
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}

    def get_response_spider(self, url_str):  # send the request
        """Fetch one listing page and return it parsed as an lxml element tree.

        BUG FIX: the original ignored ``url_str`` and always requested
        ``self.url`` (still containing the literal ``{}``).
        """
        get_response = requests.get(url_str, headers=self.headers)
        time.sleep(2)  # throttle between requests
        return etree.HTML(get_response.content.decode())

    def get_content_html(self, html):  # extract fields with XPath
        """Extract the seven listing fields from the page and store them on self."""
        self.houseInfo = html.xpath('//div[@class="houseInfo"]/text()')
        self.title = html.xpath('//div[@class="title"]/a/text()')
        self.positionInfo = html.xpath('//div[@class="positionInfo"]/a/text()')
        self.totalPrice = html.xpath('//div[@class="totalPrice"]/span/text()')
        self.unitPrice = html.xpath('//div[@class="unitPrice"]/span/text()')
        self.followInfo = html.xpath('//div[@class="followInfo"]/text()')
        self.tag = html.xpath('//div[@class="tag"]/span/text()')

    def xpath_houseInfo(self):
        """Yield the extracted houseInfo strings one by one."""
        yield from self.houseInfo

    def qingxi_data_houseInfo(self):  # clean data and persist it
        """Append one CSV row per listing of the current page to yufengqu.csv.

        BUG FIXES vs. the original:
        * ``writeheader()`` was called inside the row loop, writing a header
          line before *every* data row; the header is now written only when
          the file is empty;
        * the hand-rolled ``next()`` loop raised an unhandled ``StopIteration``
          when any field list was shorter than ``houseInfo``; ``zip`` stops
          safely at the shortest list;
        * the output file is opened once instead of once per row.
        """
        fieldnames = ['houseInfo', 'title', 'positionInfo', 'totalPrice/万元', 'unitPrice', 'followInfo', 'tag']
        rows = zip(self.houseInfo, self.title, self.positionInfo,
                   self.totalPrice, self.unitPrice, self.followInfo, self.tag)
        with open("yufengqu.csv", "a", newline="", encoding="utf-8-sig") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            if f.tell() == 0:  # brand-new file: emit the header exactly once
                writer.writeheader()
            for i, row in enumerate(rows, start=1):
                writer.writerow(dict(zip(fieldnames, row)))
                print("写入第" + str(i) + "行数据")

    def xpath_title(self):
        """Yield the extracted title strings one by one."""
        yield from self.title

    def xpath_positionInfo(self):
        """Yield the extracted positionInfo strings one by one."""
        yield from self.positionInfo

    def xpath_totalPrice(self):
        """Yield the extracted totalPrice strings one by one."""
        yield from self.totalPrice

    def xpath_unitPrice(self):
        """Yield the extracted unitPrice strings one by one."""
        yield from self.unitPrice

    def xpath_followInfo(self):
        """Yield the extracted followInfo strings one by one."""
        yield from self.followInfo

    def xpath_tag(self):
        """Yield the extracted tag strings one by one."""
        yield from self.tag

    def run(self):
        """Crawl listing pages 1..99 (page 100 excluded, as in the original)."""
        for page in range(1, 100):
            url_str = self.url.format(page)  # build the page URL
            html = self.get_response_spider(url_str)
            self.get_content_html(html)
            self.qingxi_data_houseInfo()


# if __name__ == "__main__":
#     lanjia = LanjiaSpider()
#     lanjia.run()

class MyspiderSpider(scrapy.Spider):
    """Scrapy shim: any response triggers the hand-rolled LanjiaSpider crawl."""
    name = 'myspider'
    allowed_domains = ['https://wh.lianjia.com/ershoufang/jianghan/']
    start_urls = ['https://wh.lianjia.com/ershoufang/jianghan//']

    def parse(self, response):
        # The scrapy response itself is ignored; LanjiaSpider issues its own requests.
        print('爬取ing....')
        LanjiaSpider().run()



数据可视化:(优化版)

"""
数据分析及可视化
author: 周小夏
"""
import pandas as pd
from pyecharts.charts import Line, Bar
import numpy as np
from pyecharts.globals import ThemeType
from pyecharts.charts import Pie
from pyecharts import options as opts


places = ['chengzhongqu', 'liubeiqu', 'liuchengxian', 'liujiangqu', 'liunanqu', 'yufengqu']
place = ['城中区', '柳北区', '柳城县', '柳江区', '柳南区', '鱼峰区']
avgs = []  # 房价均值
median = []  # 房价中位数
favourate_avg = []  # 房价收藏人数均值
favourate_median = []  # 房价收藏人数中位数
houseidfo = ['2室1厅', '3室1厅', '2室2厅', '3室2厅', '其他']  # 房型定义
houseidfos = ['2.1', '3.1', '2.2', '3.2']
sum_house = [0,  0, 0, 0, 0]  # 各房型数量
sum_houses = []
price = []  # 房价均值
unitprice = []  # 单价
fav = []  # 收藏人数
type = []
area = []  # 房间


def avg(name):
    df = pd.read_csv('./spiders/' + str(name)+'.csv', encoding='utf-8')
    pattern = '\d+'
    df['totalPrice/万元'] = df['totalPrice/万元'].str.findall(pattern)
    df['followInfo'] = df['followInfo'].str.findall(pattern)
    df['houseInfo'] = df['houseInfo'].str.findall(pattern)
    df['unitPrice'] = df['unitPrice'].str.findall(pattern)

    sum_houses = [0, 0, 0, 0, 0]
    # print(sum_house)
    avg_work_year = []
    areas = []
    unit_avg = []
    medians = []
    favourates = []
    k = 0
    k1 = 0
    k3 = 0
    k4 = 0

    for i in range(len(df)):
        if (i + 1) % 2 == 0:
            continue
        else:
            if len(df['unitPrice'][i]) >= 0:
                unit_avg.append(','.join(df['unitPrice'][i]).replace(',', '.'))
            if len(df['totalPrice/万元'][i]) >= 0:
                avg_work_year.append(','.join(df['totalPrice/万元'][i]).replace(',', '.'))
                medians.append(float(','.join(df['totalPrice/万元'][i]).replace(',', '.'))*100)
                price.append(','.join(df['totalPrice/万元'][i]).replace(',', '.'))
            if len(df['followInfo'][i]) ==2:
                favourates.append(int(','.join(df['followInfo'][i][:1])))
                fav.append(int(','.join(df['followInfo'][i][:1])))
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 2.1:
                k +=1
                sum_houses[0] =k
                type.append(2.1)
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 3.1:
                k1 +=1
                sum_houses[1] =k1
                type.append(3.1)
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 2.2:
                k3 +=1
                sum_houses[2] =k3
                type.append(2.2)
            if float(','.join(df['houseInfo'][i][:2]).replace(',', '.')) == 3.2:
                k4 +=1
                sum_houses[3] =k4
                type.append(3.2)
            else:
                k4 +=1
                sum_houses[4] = k4
                type.append('other')
            areas.append(float(','.join(df['houseInfo'][i][2:4]).replace(',', '.')))
    sum_house[0] =sum_houses[0]
    sum_house[1] = sum_houses[1]
    sum_house[2] = sum_houses[2]
    sum_house[3] = sum_houses[3]
    sum_house[4] = sum_houses[4]
    sum_house.append(sum_house[0])
    sum_house.append(sum_house[1])
    sum_house.append(sum_house[2])
    sum_house.append(sum_house[3])
    sum_house.append(sum_house[4])
    # print(sum_houses)
    favourates.sort()
    favourate_median.append(int(np.median(favourates)))
    medians.sort()
    median.append(np.median(medians))
    # price = avg_work_year
    b = len(avg_work_year)*100
    b1= len(favourates)
    b2 = len(unit_avg)
    b4 = len(areas)*100
    sum = 0
    sum1 = 0
    for i in unit_avg:
        sum = sum+float(i)
    unitprice.append(round(sum/b2, 2))
    for i in areas:
        sum = sum+float(i)
    area.append(round(sum/b4, 2))
    for i in avg_work_year:
        sum = sum+float(i)
    avgs.append(round(sum/b, 2))
    for i in favourates:
        sum1 = sum1+float(i)
    favourate_avg.append(round(int(sum1/b1), 2))

# Aggregate every district CSV into the module-level lists.
for district in places:
    avg(district)

print("各区平均房价", avgs)
print('各房型的出售总数:', sum_house)
print("房间面积", area)
print("房价单价", unitprice)

# Undo the x100 scaling applied to the medians inside avg().
a = [m / 100 for m in median]
# print(median)
# print(favourate_avg,favourate_median)
line = Line()
line.add_xaxis(place)
line.add_yaxis("柳州市各地房价平均值(万元)", avgs)
line.add_yaxis("柳州市各地房价中位数值(万元)", a)
line.render("predict_line.html")

def bar() -> Bar:
    """Bar chart of the average unit price per district."""
    chart = Bar({"theme": ThemeType.MACARONS})
    chart.add_xaxis(place)
    chart.add_yaxis("平均值", unitprice)
    chart.set_global_opts(title_opts={"text": "柳州市各地房价单价(元)"})
    return chart

bar().render("unit_prices.html")
def bar() -> Bar:
    """Bar chart comparing mean and (rescaled) median prices per district."""
    chart = Bar({"theme": ThemeType.MACARONS})
    chart.add_xaxis(place)
    chart.add_yaxis("平均值", avgs)
    chart.add_yaxis("中位数", a)
    chart.set_global_opts(title_opts={"text": "柳州市各地房价(万元)"})
    return chart
bar().render("predict_bar.html")
# print(sum_house)
def bar() -> Bar:
    """Grouped bar chart of layout counts; each 5-wide slice of ``sum_house``
    corresponds to one avg() call."""
    chart = Bar({"theme": ThemeType.MACARONS})
    chart.add_xaxis(houseidfo)
    chart.add_yaxis(place[0], sum_house[0:5])
    chart.add_yaxis(place[1], sum_house[5:10])
    chart.add_yaxis(place[2], sum_house[10:15])
    chart.add_yaxis(place[3], sum_house[15:20])
    chart.add_yaxis(place[4], sum_house[20:25])
    chart.add_yaxis(place[-1], sum_house[25:30])
    # chart.add_yaxis("中位数", favourate_median)
    chart.set_global_opts(title_opts={"text": "柳州市各地房型\n数量"})
    return chart
bar().render("house_bar.html")

line = Line()
line.add_xaxis(place)
line.add_yaxis("柳州市各地房子平均面积\n(平米)", area)
line.render("Area_line.html")



# Pie: average follower count per district.
list_num = favourate_avg
attr = place
s = [list(pair) for pair in zip(attr, list_num)]
c = (Pie()
     .add("", s)
     .set_global_opts(title_opts=opts.TitleOpts(title="柳州市各区楼房\n平均收藏人数"))
     .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}")))
c.render("pie_avg.html")

# Pie: median follower count per district.
list_num = favourate_median
attr = place
s = [list(pair) for pair in zip(attr, list_num)]
c = (Pie()
     .add("", s)
     .set_global_opts(title_opts=opts.TitleOpts(title="柳州市各区楼房\n收藏人数中位数"))
     .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}")))
c.render("pie_median.html")

from pyecharts import options as  opts
from pyecharts.charts import Scatter3D
from pyecharts.faker import Faker


# Line chart relating mean area to mean price per district.
line = Line()
line.add_xaxis(place)
line.add_yaxis("房间面积\n(平米)", area)
line.add_yaxis("房价\n(/万元)", avgs)
line.render("price1_line.html")


# Convert the collected price strings to floats.
price = [float(p) for p in price]
# print(price)
data = []
# Build (layout, avg-favourites, price) triples for the 3D scatter.
# BUG FIX / perf: the original looped 100*100*500 times and relied on a bare
# ``except`` to skip out-of-range indices, raising millions of pointless
# IndexErrors.  Bounded ranges produce exactly the same rows.
for j in range(min(100, len(type))):
    for k in range(min(100, len(favourate_avg))):
        for i in range(min(500, len(price))):
            data.append([type[j], favourate_avg[k], price[i]])
# print(data)
scatter = (
     Scatter3D(init_opts=opts.InitOpts(width='900px', height='600px'))  # init
          .add("", data,
               grid3d_opts=opts.Grid3DOpts(
                    width=300, depth=300, rotate_speed=300, is_rotate=True,
               ),)

          # global options
          .set_global_opts(
          title_opts=opts.TitleOpts(title="房型——关注度——价格\n三维关系图"),  # title
          visualmap_opts=opts.VisualMapOpts(
               max_=300,  # colour-map maximum
               pos_top=200,  # distance of the visualMap widget from the top
               range_color=Faker.visual_color  # colour gradient
          )
     )
          .render("3D散点图.html")
)
print('数据分析和可视化结束,左边点开~')

最后,别忘记了关注公众号~

分享最新算法!

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-CiQYvfTG-1594979733395)(D:\CSDN\pic\WeChat Image_20200716151357.jpg)]

本文地址:https://blog.csdn.net/qq_46098574/article/details/107414402

如对本文有疑问, 点击进行留言回复!!

相关文章:

验证码:
移动技术网