README.txt

Introduction

This module contains the implementation patch and installation scripts for "MADlib", which supports in-database AI in openGauss. Currently, openGauss supports the machine learning algorithms in MADlib 1.17.

Before Installation

MADlib relies on plpython2, so the database must be compiled with Python support.

Check Python Environment

The Python version must be 2.7.12 or later; we highly recommend 2.7.17 or 2.7.18.

  1. If your Python version is >= 2.7.12, install the development package with 'yum install python-devel'. Otherwise, go to step 2.

  2. Build and install Python 2.7.18 yourself, passing the '--enable-shared' option to configure:

    ./configure --prefix=YOUR_XXX --enable-shared --enable-unicode=ucs4
    make -sj && make install -sj
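
The requirements above can be checked before recompiling the database. The following is a minimal sketch and not part of the original scripts: the helper name version_ge is hypothetical, the interpreter is assumed to be on PATH as python2, and 'sort -V' assumes GNU coreutils.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on 'sort -V' (GNU coreutils) for version-aware ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: the interpreter to check is on PATH as 'python2'.
if command -v python2 >/dev/null 2>&1; then
    py_ver="$(python2 -c 'import platform; print(platform.python_version())')"
    if version_ge "$py_ver" 2.7.12; then
        echo "Python $py_ver: version OK"
    else
        echo "Python $py_ver: too old, need >= 2.7.12"
    fi
    # Py_ENABLE_SHARED is 1 when Python was configured with --enable-shared.
    python2 -c 'import sysconfig; print("enable-shared: %s" % sysconfig.get_config_var("Py_ENABLE_SHARED"))'
    # sys.maxunicode is 1114111 on a UCS-4 build.
    python2 -c 'import sys; print("ucs4: %s" % (sys.maxunicode == 1114111))'
else
    echo "python2 not found on PATH"
fi
```

If you installed Python under a custom prefix, replace python2 with the full path to that interpreter.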
    

Re-compile Database

Recompile openGauss, passing the '--with-python' option to configure.
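
One way to confirm that the rebuilt database was configured with Python support is to inspect the recorded configure options. This is a sketch under assumptions: it assumes your build ships pg_config under $GAUSSHOME/bin, and the helper name has_with_python is hypothetical.

```shell
# Hypothetical helper: succeeds if the given configure-option string
# contains --with-python.
has_with_python() {
    case "$1" in
        *--with-python*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: pg_config is installed under $GAUSSHOME/bin.
PG_CONFIG="${GAUSSHOME:-}/bin/pg_config"
if [ -x "$PG_CONFIG" ]; then
    if has_with_python "$("$PG_CONFIG" --configure)"; then
        echo "database was configured with --with-python"
    else
        echo "database was NOT configured with --with-python; recompile it"
    fi
else
    echo "pg_config not found at $PG_CONFIG; skipping check"
fi
```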

MADlib Installation

Compile

  1. Patch MADlib:

    tar -zxf apache-madlib-1.17.0-src.tar.gz
    cp madlib.patch apache-madlib-1.17.0-src
    cd apache-madlib-1.17.0-src/
    patch -p1 < madlib.patch
    
  2. Compile MADlib. MADlib downloads its dependencies during compilation.

    1. If your machine can connect to the Internet, run the following (replace {YOUR_MADLIB_INSTALL_FOLDER} with your install folder):

        ./configure -DCMAKE_INSTALL_PREFIX={YOUR_MADLIB_INSTALL_FOLDER} \
            -DPOSTGRESQL_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_CLIENT_INCLUDE_DIR=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_SERVER_INCLUDE_DIR=$GAUSSHOME/bin/
        make && make install -sj

    2. If your machine cannot download the dependencies online, download them yourself and pass their local paths to configure (replace {YOUR_MADLIB_INSTALL_FOLDER} and {YOUR_DEPENDENCY_FOLDER} with your own folders):

        ./configure -DCMAKE_INSTALL_PREFIX={YOUR_MADLIB_INSTALL_FOLDER} \
            -DPYXB_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/PyXB-1.2.6.tar.gz \
            -DEIGEN_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/eigen-branches-3.2.tar.gz \
            -DBOOST_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/boost_1_61_0.tar.gz \
            -DPOSTGRESQL_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_CLIENT_INCLUDE_DIR=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_SERVER_INCLUDE_DIR=$GAUSSHOME/bin/
        make && make install -sj
    
  3. Finished
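
For the offline build in step 2, it can help to confirm that all three dependency archives are present before running configure. A minimal sketch, assuming the archives keep the file names used in the configure options above; the helper name check_deps and the DEP_DIR variable are hypothetical.

```shell
# Hypothetical helper: check_deps FOLDER
# Reports each missing archive on stderr; returns non-zero if any is missing.
check_deps() {
    dep_dir="$1"
    missing=0
    for f in PyXB-1.2.6.tar.gz eigen-branches-3.2.tar.gz boost_1_61_0.tar.gz; do
        if [ ! -f "$dep_dir/$f" ]; then
            echo "missing: $dep_dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: DEP_DIR is set to your dependency folder; nothing runs if it is unset.
if [ -n "${DEP_DIR:-}" ]; then
    check_deps "$DEP_DIR" && echo "all dependency archives found in $DEP_DIR"
fi
```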

Install MADlib

Install Python packages

Some algorithms depend on the following Python packages:

 pip install numpy==1.14.5
 pip install pandas==0.24.2
 pip install scipy
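
The packages must be installed for the same interpreter that the database was compiled against. The following sketch checks that each module can be imported; the helper name check_modules is hypothetical, and it assumes that interpreter is on PATH as python2.

```shell
# Hypothetical helper: check_modules INTERPRETER MODULE...
# Tries to import each module; returns non-zero if any import fails.
check_modules() {
    py="$1"
    shift
    failed=0
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok: $mod"
        else
            echo "missing: $mod"
            failed=1
        fi
    done
    return "$failed"
}

# Assumption: the database uses the interpreter found on PATH as 'python2'.
if command -v python2 >/dev/null 2>&1; then
    check_modules python2 numpy pandas scipy || echo "install the missing packages with pip"
fi
```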

Connect to your database with gsql and create a database for MADlib:

    create database <YOUR_DATABASE> dbcompatibility='B';

Then run madpack from the MADlib install folder:

    cd {YOUR_MADLIB_INSTALL_FOLDER}
    ./madpack -s <YOUR_SCHEMA> -p opengauss -c <DATABASE_USERNAME>@127.0.0.1:<PORT>/<YOUR_DATABASE> install
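
After the install step, upstream MADlib's madpack offers an install-check command that runs a set of sanity tests. The sketch below assumes the patched madpack keeps that command for the opengauss platform. The helper name madpack_conn and all default values (user, port, database, schema) are placeholders, not values from this module.

```shell
# Hypothetical helper: build a madpack connection string user@host:port/database.
madpack_conn() {
    printf '%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values; replace them with your own settings.
DB_USER="${DB_USER:-omm}"
DB_PORT="${DB_PORT:-5432}"
DB_NAME="${DB_NAME:-madlib_db}"
DB_SCHEMA="${DB_SCHEMA:-madlib}"
CONN="$(madpack_conn "$DB_USER" 127.0.0.1 "$DB_PORT" "$DB_NAME")"

# Run this from {YOUR_MADLIB_INSTALL_FOLDER}; it only prints the command
# when madpack is not in the current directory.
if [ -x ./madpack ]; then
    ./madpack -s "$DB_SCHEMA" -p opengauss -c "$CONN" install-check
else
    echo "madpack not found here; would run: ./madpack -s $DB_SCHEMA -p opengauss -c $CONN install-check"
fi
```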

Additional software

  1. If you use Facebook Prophet:

    pip install pystan
    pip install holidays==0.9.8
    pip install fbprophet==0.3.post2

  2. If you use XGBoost:

    pip install xgboost
    pip install scikit-learn

Primary/secondary

You need to copy the Python installation and {YOUR_MADLIB_INSTALL_FOLDER} to the same paths on the secondary machine.
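
One possible way to do the copy is rsync over SSH. The sketch below only prints the commands so that they can be reviewed first. It assumes rsync and SSH access between the machines; the helper name sync_cmd, the host name, and both default paths are placeholders.

```shell
# Hypothetical helper: sync_cmd HOST FOLDER
# Prints the rsync command that mirrors FOLDER to the same path on HOST.
sync_cmd() {
    host="$1"
    dir="${2%/}"    # strip one trailing slash so the paths are built consistently
    printf 'rsync -a %s/ %s:%s/\n' "$dir" "$host" "$dir"
}

# Placeholder values; replace them with your own host and folders.
SECONDARY_HOST="${SECONDARY_HOST:-secondary-host}"
PYTHON_PREFIX="${PYTHON_PREFIX:-/opt/python2.7}"
MADLIB_DIR="${MADLIB_DIR:-/opt/madlib}"

# Only prints the commands; review them, then run them yourself.
sync_cmd "$SECONDARY_HOST" "$PYTHON_PREFIX"
sync_cmd "$SECONDARY_HOST" "$MADLIB_DIR"
```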